STATISTICAL METHODS IN MEDICAL RESEARCH
I. QUALITATIVE STATISTICS (ENUMERATION DATA)


By DONALD MAINLAND


Abstract 


This article is designed to help investigators in applying to qualitatively 
classified clinical and laboratory data the appropriate statistical treatment— 
tests of significance in binomial and multinomial distributions, estimation of 
confidence limits, analysis of contingency tables, and estimation of sample sizes 
required for further investigation. Section A is a brief introduction (definitions 
and principles). Section B comprises 40 examples classified so that the investi- 
gator can choose data and problems comparable to his own. Questions that 
arise in the examples, regarding experimental design (especially random 
sampling) and the interpretation of the tests, are discussed in Section C (Notes). 

Because the standard deviation of the binomial and the chi square contingency 
test are often used without appreciation of the risk entailed, tables, which can 
be used also in nonmedical investigation, are presented: binomial confidence 
limits (with graphs) and exact probabilities for small-sample fourfold contin- 
gency tables. For samples not covered by the tables, precautions and rules 
regarding the use of chi square have been derived from more than five hundred 
comparisons between chi square and the exact method. To help in the exact 
computation of probabilities where that is necessary, four-decimal logarithms 
of factorials of numbers up to 1000 are given. 
Preface


This article was prepared primarily for the guidance of investigators who 
are grantees of the Medical Division of the National Research Council. Such 
investigators are commonly concerned with small samples, and, where the 
data are in the form of measurements, appropriate methods (e.g., Fisher’s t
test) are now commonly applied, at least by those who apply statistical tests 
at all. On the other hand, where data are classified qualitatively (e.g., as 
recoveries and deaths, presence or absence of color blindness, rough and 
smooth colonies of bacteria), many investigators still apply methods, such as 
chi square and the standard error of the binomial, without appreciating how 
unreliable these tests are with small samples and skew distributions. 


There are available, of course, exact tests, but these are somewhat com- 
plicated and laborious, at least to beginners. Indeed, many even find linear 
interpolation irksome. Some workers therefore recommend the simple tests, 
to be supplemented by the exact tests where greater precision is needed; but 
this still leaves the investigator in ignorance of when to apply the exact tests 
because he does not know how far the simpler crude tests may have led him 
astray. The better plan, therefore, seems to be: (1) the provision of tables 
and graphs, based on the more accurate methods; (2) the statement of limits 
within which the simple tests, especially chi square, can be trusted for 
problems outside the range of the tables; (3) the presentation, in as simple a 
form as possible, of methods for the application of exact tests where they are 
necessary. 


As these three objectives became better defined, the present project, which
started as an effort to prepare notes on current statistical techniques with
medical examples, became in addition an effort to provide suitable tables and 
graphs, and to experiment with some of the simpler tests in order to define 
more precisely their reliability and limitations. 


In February, 1946, a draft of the notes and specimen tables was issued, for 
criticism by members of the Medical Research Committee and other investi- 
gators, and the response was very helpful. Besides affecting details it 
suggested a change in the plan of presentation. The plan now adopted, and 
outlined in the Introduction to this article, was devised to meet the wishes 
of those who, in the words of one investigator, would say: “I have a problem
... Must I spend a month of free evenings reading a book
from end to end several times and mastering all details before deciding how 
to go about solving the problem? I hope not.” 


Some workers, of course, object to the kind of plan adopted here, because it 
appears to put into the hands of an investigator a dangerous weapon without 
adequate warning, without preliminary explanation of principles, without 
training in the design of experiments or observations, and without general 
guidance in the interpretation of the results of statistical tests. On the other 
hand, many workers have come to appreciate and apply statistical methods 
by the more direct type of approach. Having obtained statistical help in a 
specific problem they have come to enquire about principles and have seen
the importance of design, and, having in the first place used tables and graphs 
that reduce or simplify calculation, they have come to use more elaborate 
methods in problems that demand them. It is hoped, moreover, that in this 
article the Introduction, the Comments in the Examples, and the Notes will 
minimize the dangers of the plan of presentation; although it is doubtful 
whether any plan could entirely eliminate the beginner’s need for personal 
guidance by someone who has gone farther in statistical methods. 

Criticism and suggestions are earnestly requested from those who may 
examine or use the text, the tables, or the graphs. 


Section A—Introduction 


This article is so arranged that very little discussion stands between the 
investigator and the test that he wishes to apply. There is risk in such an 
arrangement because it suggests that mathematical tests are the most 
important part of statistical procedure. Of greater importance than the tests 
are, first, the planning of the experiment or observation so that valid inferences 
shall be obtainable, and, secondly, the interpretation of the results of the 
mathematical tests. 

Those investigators who are already acquainted with the principles of 
statistical reasoning, familiar with common terms, and convinced of the 
importance of random sampling, will presumably use the article chiefly as a 
means of facilitating their statistical tests. Other investigators will, it is 
hoped, find in the present section (Section A) a sufficient statement of prin- 
ciples to enable them to start using the tables and tests. The Summary of 
Examples at the beginning of Section B (p. 12) should help them to find the 
appropriate table, graph, or test by directing them to a problem like their own. 

The Examples sometimes illustrate bad planning, incorrect sampling, 
unproved assumptions, fallacious deductions, and other errors. These are 
discussed in the Comments in the Examples, and it is hoped that they will 
indicate the desirability of a fuller acquaintance with general principles. 
Section C (Notes) may help investigators to acquire this. 

The Examples will, moreover, probably lead the investigator to ask what 
the tables and formulae are really doing, what is their precision, and what 
methods are available for greater precision. Such matters also are discussed 
in Section C. 

The Index is intended as a guide to the main topics discussed in all three 
Sections, and the headings on each right-hand page of the article are arranged 
to facilitate reference to specific sections, examples, subsections, or notes. 


1. TWO TYPES OF DATA


At the outset the investigator should decide whether his data are, or will be, 
qualitative statistics (enumeration data) or mensuration data (measurements). 
Enumeration Data (Qualitative Statistics)


Enumeration data (qualitative statistics) are records in which individuals, 
such as persons, animals, blood cells, or bacteria, have been classified according 
to certain qualities or attributes and the numbers of individuals in the various 
classes have been recorded. For example, a number of diseased men or 
animals may be subjected to a certain treatment and then classified under the 
headings ‘death’, ‘recovery’, or ‘cure’, ‘some improvement’, ‘no improvement’. 
These may then be compared with another group of individuals, subjected to 
another treatment and similarly classified. Statistical tests must be applied 
to show whether we are justified in concluding that one treatment is better 
than another, and the type of test suitable for this kind of data is different 
from the type applicable to the other main class of data—mensuration data. 


Some laboratory workers, being familiar chiefly with mensuration data, are apt to question 
the application of precision mathematics to enumeration data unless the material can be 
rigidly classified, e.g., as ‘dead’ or ‘living’. Conceptions of ‘mild’, ‘moderate’, and ‘severe’
tend to differ with different observers and even with the same observer at different times, but 
this does not mean that judgments based on ‘common sense’ or ‘general experience’ can 
replace statistical tests of such enumeration data. 


The statistical tests, far from introducing a spurious precision, commonly show that results, 
which the untested judgment of the laboratory worker or clinician would consider significant, 
could easily be accounted for by chance. 


The experiments or observations must, of course, be planned to minimize the subjective 
errors and these then must be investigated, e.g., by the observer testing the constancy of his 
criteria after a lapse of time; but this must be done also with methods of measurement, which 
are by no means so objective and unvarying as is often supposed. 


Mensuration Data 


Mensuration data are records of measurements; for example, the effect of 
treatment may be measured by change of blood pressure or urinary output. 
In most cases the actual measurements, their averages, and other derivatives 
can be used in statistical tests, and the tests differ from those, suitable for 
enumeration data, in which numbers of individuals and not measurements are 
used. 

This article is concerned with enumeration data, but the principles of 
statistical reasoning are the same for both types of data, and, as is shown in 
Examples 37 to 40, measurements can often profitably be treated as qualitative 
statistics. 


2. SOME PRINCIPLES AND DEFINITIONS


Samples and Population 


Nearly all forms of medical research involve the use of samples (of patients, 
animals, drops of blood, urines, livers from autopsies, blood pressure readings, 
and so on), in order to obtain information regarding a population, i.e., the 
material sampled. 


Random Samples 
Before we can argue from a sample to a population we must be sure that 
the sample is chosen in the proper way, i.e., in such a way that we can make
use of our knowledge of the relations between samples and the populations
from which they are drawn. We have a very great deal of such knowledge
when the samples are chosen strictly at random. This does not mean a haphazard
choice, or the selection of any patients or animals that happen to come along, 
or a sampling that we believe to be random because we cannot think of any 
reason why it should not be random. 


Random sampling is the kind of sampling that occurs when we deal cards 
from a thoroughly shuffled pack, or when we mix thoroughly in a box a 
thousand uniform disks (such as metal-rimmed cardboard labeling tags) and 
then take out a sample of, say, 20 disks. The factors that determine whether 
a disk is, or is not, included in a sample are numerous and independent of each 
other—the physical features of the disk, the various forces and motions 
imposed on it by gravity, by the other disks, and by the hands that mix the 
disks. In brief, the occurrence of a disk in a sample, and therefore the com- 
position of each sample, is due to chance, which can be defined as the action of a 
multiplicity of independent causes. A random sample, therefore, is one whose 
composition is determined by chance. Other definitions are: “a sample chosen
in such a way that all individuals in the population have an equal opportunity
of being selected”, and “a sample chosen according to a rule that is completely
independent of the observations to be made on the sample”. (For a demonstration
of random sampling see Note 1, and for practical techniques see
Note 23.)
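
The drawing of 20 disks from a box of a thousand can be imitated on a computer. The following is only a minimal sketch, assuming that a pseudo-random number generator is an acceptable substitute for thorough physical mixing; the function name `draw_random_sample` and the seed value are illustrative and do not come from the article.

```python
import random

def draw_random_sample(population_size=1000, sample_size=20, seed=None):
    """Draw disk numbers 1..population_size without replacement.

    Every disk has the same opportunity of being selected, and the rule
    of selection does not depend on what is marked on the disks.
    """
    rng = random.Random(seed)  # a seed is used only to make the draw repeatable
    disks = range(1, population_size + 1)
    return rng.sample(disks, sample_size)

sample = draw_random_sample(seed=1946)
print(sorted(sample))
```

Recording the seed allows the same draw to be repeated later, which may help when the selection has to be documented.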


It is, of course, often valuable, before random sampling, to divide a popula- 
tion into classes. Knowing, for instance, that people (or animals) of different 
race (or animal stock), sex, and age, often differ in their reactions to disease, 
to therapeutic treatment, or to experimental treatment, we often divide the 
original population accordingly. Each of these subclasses thus forms a 
population to be sampled. In such cases there is at first purposive sampling 
(division into subclasses), but the final sampling must be strictly random if 
we are to draw valid conclusions. 


Statistical tests are methods of drawing such conclusions by showing what 
allowance must be made for the differences that occur among random samples 
of the same population, i.e., differences due to chance. 


To show how this statement applies to qualitative statistics, it is best to 
consider first a population whose composition we know, e.g., a thousand 
persons (500 males and 500 females) and thereby to illustrate the terms 
‘frequency’, ‘probability’, ‘odds’, and ‘chances’. For concreteness we suppose 
that each individual is represented by a disk marked ‘male’ or ‘female’. 


Frequency 


Frequency is the number of individuals of a certain class in a population or 
in a sample; e.g., 500 males in a population of 1000. In a sample of 10 from 
that population the males may have a frequency of 4 and the females a 
frequency of 6. Frequencies are often expressed as percentages, e.g., 50% 
males; but it will soon become obvious that statements of percentage frequencies 
in samples, without a statement of actual numbers of individuals, are useless. It
is sound advice to beware of percentages.* 


Probability, Chances, and Odds 


If there are 500 males and 500 females, when we pick an individual strictly 
at random our expectation of picking a male is equal to our expectation of 
picking a female. The probability of picking an individual of a certain class 
by random sampling is defined as the relative frequency of individuals of that 
class in the population, i.e., the frequency divided by the total population. 
Here the probability of picking a male is 500/1000 = ½ or 0.5. The chance
of picking a male is one out of two, and the odds for male and female are 
even—one to one. 

If, in a population of 1000 persons afflicted by the same disease, 700 recover 
and 300 die, and we represent this by marking 1000 disks appropriately R or 
D, the probability of picking an R disk is 700/1000 = 0.7, and the probability 
of D is 0.3. The sum of the probabilities of all classes, here R and D, must be
1. The odds in favor of an R disk (against a D disk) are 7 to 3. 
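
The arithmetic of probability and odds for the two disk populations can be checked as follows. This is a small sketch using exact fractions; the function names are illustrative assumptions, and the odds function assumes that at least one individual falls outside the class considered.

```python
from fractions import Fraction

def probability(frequency, population_total):
    """Relative frequency of a class in the population."""
    return Fraction(frequency, population_total)

def odds_in_favor(frequency, population_total):
    """Odds (for, against) in lowest terms; requires frequency < population_total."""
    against = population_total - frequency
    ratio = Fraction(frequency, against)
    return ratio.numerator, ratio.denominator

p_r = probability(700, 1000)   # recoveries
p_d = probability(300, 1000)   # deaths
print(float(p_r), float(p_d), p_r + p_d == 1)
print(odds_in_favor(700, 1000))  # R against D
print(odds_in_favor(500, 1000))  # male against female
```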

A probability indicated by the capital letter, P, is frequently used in tables 
and tests in order to show whether a result such as we have observed in a 
certain sampling experiment would be found frequently or rarely by random 
sampling of a particular population. Definitions of P are given in Section C 
(e.g., Note 3), but in this Introduction it is probably preferable merely to 
indicate its general meaning and how it is used in the two main types of 
problem in qualitative statistics. 


Types of Problem 


The types of problem in qualitative statistics are mainly: (1) argument 
from a sample to its population, and (2) comparison of two or more samples in 
order to decide whether they were probably drawn from the same population 
or from different populations. Both these types of problem may lead to 
further questions, e.g.: How large must the sample be before the conclusions 
desired can be drawn? 

The Examples in Section B are classified according to the type of problem; 
but here we must glance at the general method of reasoning in the two main 
types. 

3. ARGUMENT FROM SAMPLE TO POPULATION 
Levels of Significance 


Let us suppose that we have 20 patients suffering from a certain disease 
and treated by a certain method. Nine recover and 11 die. We wish to 
know whether, by continuing to treat such patients by the same method, our 
recovery rate might be as high as 70%. We consider our 20 patients as a 
random sample of the population that we should create by continuing to use 
the treatment. By methods discussed in Section C, Note 2, we can show that,
if the population recovery rate is 70%, the probability of 9 or fewer
recoveries in a random sample of 20 is specified by P = 0.017. Adopting the
standards usually applied in such cases we say: “P is less than 0.025; therefore
the frequency in our sample (9 recoveries in 20, i.e., 45%) is significantly
different from (lower than) 70%, or the difference between the recovery rate
in our sample and 70% is significant”.


* A laboratory worker who is used to ... techniques ... analysis sometimes
finds it difficult to realize that a difference of ...0 or 80% between samples
of enumeration data may have no significance at all.


‘Significant’ means ‘suggestive’ or ‘indicative’ of a real difference. When 
we give that verdict in the present example we mean that we believe that, if 
we obtained more and more (or larger and larger) samples of the material 
represented by the given sample, we should approach a percentage of recoveries 
that was different from (lower than) 70%. The difference might be either 
greater or less than the observed difference (70 minus 45 = 25%), but we 
feel justified in believing that it would not be zero. Expressed in another way, 
we believe that chance (random sampling variation) will not account for the 
difference. In still another form, we say that the true (population) percentage 
of recoveries is unlikely to be as high as 70%. 


If our original sample of 20 had contained only 7 recoveries and 13 deaths 
we should find by the same methods (Section C, Note 2) that, if the population 
value were 70% recoveries, P for our sample would be 0.001. Again adopting 
the usual standards we say: “P is less than 0.005; therefore the difference is
highly significant (or very significant)”. We feel in this case stronger confidence
that chance will not account for the difference, and we say that the true 
percentage is highly unlikely (or very unlikely) to be as high as 70%. 


Finally, if P is more than 0.025 we say that the difference is not significant. 
This is a verdict of ‘not proven’. The evidence is not sufficient to make us 
believe that a real difference exists. Thus, if our sample of 20 had contained 
10 recoveries we should, in a population of 70%, find that P for that sample 
would be 0.048, which is greater than 0.025. Therefore the difference would 
not be significant; it might well be due to chance. We could say therefore 
that the population value was not unlikely to be 70% recoveries. 
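
The three values of P quoted in this subsection (for 9, 7, and 10 recoveries in 20) can be checked by summing terms of the binomial distribution. The methods of Section C, Note 2, are not reproduced here, so the sketch below rests on the assumption that P is the exact binomial probability of the observed number of recoveries or fewer; the function name is illustrative.

```python
from math import comb

def binomial_lower_tail(k, n, p):
    """Probability of k or fewer recoveries in a random sample of n
    when the population recovery rate is p (exact binomial sum)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

# Population recovery rate assumed to be 70%; samples of 20 patients.
for recoveries in (9, 7, 10):
    p_value = binomial_lower_tail(recoveries, 20, 0.7)
    print(recoveries, round(p_value, 3))
```

Under that assumption the three results agree, to three decimal places, with the values 0.017, 0.001, and 0.048 given in the text.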


It is shown in Section C, Note 4, that when we say that P = 0.025 we
mean that only 2½% of random samples in a particular population will lie
so far (and farther) away from the true (population) value, above or below.
Considering also the 2½% of samples at the opposite end (below or above)
we have a total of 5% of the samples, or 1 in 20; i.e., the odds against finding
such samples are 19 to 1. For P = 0.005 the corresponding figures are ½%
of the samples below the true value and ½% above, a total of 1%, or odds of
99 to 1.


To contrast the two levels of significance we speak of the one with P = 0.025
as the 2½% level of significance, and of the one with P = 0.005 as the ½%
level of significance.


For discussion of the reasonableness of these significance conventions see 
Section C, Note 5. 
Confidence Limits

A given sample may belong either to (1) a population with a higher percentage
than the percentage shown by the sample, or to (2) a population with
a lower percentage than the sample percentage; but, using the probability
criteria (P = 0.025 and P = 0.005), we can state, with confidence corresponding
to those probabilities, the limits beyond which we do not believe
that the true (population) percentage lies. Tables I (A and B) and II enable
us to find such limits. For instance, if we have a sample of 20 patients with
9 recoveries and 11 deaths, we turn up Table IB under “Number of A’s in
sample = 9”, and look along the line N = 20. Under “Upper limits,
P = .025” we find 68.5%. This means that the upper confidence limit at
the 2½% level (P = 0.025) is 68.5%. On the same line we see that the
upper limit at the ½% level (P = 0.005) is 74.3%. We can express these
findings in words by saying that the proportion of recoveries, in the population
of which our sample is a random sample, is unlikely to be more than 68.5%
and very unlikely (or highly unlikely) to be more than 74.3%.


The lower limits are likewise shown, and we can say that the population 
percentage is unlikely to be outside the range or confidence belt, 23.0 to 68.5%, 
but, if we adopt the usual rule, we are not prepared to estimate it any closer 
than that. It may lie anywhere within that belt. However, in some cases it 
may be sufficient or desirable to accept a narrower belt, 29.3 to 61.5% at the 
10% level (P = 0.1), but our confidence here is low. (For further discussion 
see Section C, Note 6.) 
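
Where the tables are not at hand, limits of this kind can be approximated numerically. The sketch below assumes that the tabulated limits are exact binomial limits, i.e., the population proportions at which the observed sample, or a more extreme one, has tail probability P; the construction of Tables I and II is not described in this section, so that is an assumption, and the function names and the bisection routine are illustrative.

```python
from math import comb

def lower_tail(k, n, p):
    """Probability of k or fewer A's in a sample of n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def upper_tail(k, n, p):
    """Probability of k or more A's in a sample of n."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def solve(f, target, increasing, tol=1e-10):
    """Bisection on [0, 1] for f(p) = target, with f monotonic in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid   # the root lies above mid
        else:
            hi = mid   # the root lies below mid
    return (lo + hi) / 2

def confidence_limits(k, n, p_tail):
    """Lower and upper limits for the population proportion, each with tail probability p_tail."""
    lower = 0.0 if k == 0 else solve(lambda p: upper_tail(k, n, p), p_tail, True)
    upper = 1.0 if k == n else solve(lambda p: lower_tail(k, n, p), p_tail, False)
    return lower, upper

for p_tail in (0.025, 0.005, 0.1):
    lower, upper = confidence_limits(9, 20, p_tail)
    print(p_tail, round(100 * lower, 1), round(100 * upper, 1))
```

The printed percentages can be compared with the belts quoted above for 9 recoveries in 20; small discrepancies in the last figure may arise from rounding in the tables.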


4. COMPARISON OF TWO OR MORE SAMPLES
Arrangement of Data 


Let us suppose that a random sample of nine individuals from a certain 
population, e.g., a population of young adult male human beings with a certain 
disease, receives Treatment V, and another random sample of six from the 
same population receives Treatment W. The observations are recorded in a 
fourfold table in which actual numbers, not percentages, are given: 


Treatment    Recoveries (R)    Deaths (D)    Total
V                  2                7           9
W                  5                1           6
Total              7                8          15


Many different kinds of data can be presented by such a table. For 
example, among nine women, attacks of a certain disease might be severe in 
seven, mild in two, whereas among six men they might be severe in one and 
mild in five; or a certain muscle might be present in only two of nine male 
white cadavera, and in five of six male Negro cadavera. A table that shows 
the joint occurrence of two sets of attributes or qualities (e.g., type of treatment 
and outcome of disease; sex and severity of attack) is a contingency table
(Latin, con, together; tangere, to touch). Each of the two sets of attributes 
may have two or more components, e.g., ‘mild’, ‘moderate’, ‘severe’. The 
simplest form, as in the present example, is a fourfold contingency table. 
For analysis, its contents can be looked on either as two ‘treatment’ samples 
divided according to outcome (recovery or death) or as two ‘outcome’ samples 
divided according to treatment. The results of the analysis will be the same. 


Methods of Analysis 


Methods of analysis of such data are shown by numerous examples in 
Section B. For very small samples (up to 20 individuals) the results can be 
obtained directly from Tables IV or V, and for larger, but equal, samples 
Table VI provides some information. In other cases simple mathematical 
formulae are used. Whatever particular technique is appropriate, the under- 
lying reasoning can be illustrated by reference to our samples (Treatments V 
and W) as follows. 


Let us suppose that there is no real difference between the effects of Treat- 
ments V and W. If, then, we were to take more and more samples we should 
find that some showed V apparently better than W, while others showed W 
apparently better than V, but when we combined the results we should find 
that the recovery rates for the two treatments became more and more alike, 
and we should become more and more convinced that, in respect of recovery 
rate, the samples were all from one population. Let us therefore apply this 
supposition to our actual samples of nine and six patients, and find out how 
frequent or how rare would be the occurrence of such differences as we have 
observed. 


As before, we use the probability P as a measure of this and we adopt the 
same standards of significance. Thus, it will be shown in Example 15 that, 
if our samples came from the same population, P would be greater than 0.025, 
i.e., we should expect, as a result of chance alone, to find more than 2½% of
samples showing as much, or more, evidence in favor of Treatment W. 
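
The statement that P exceeds 0.025 for the fourfold table of Treatments V and W can be checked by direct enumeration. Example 15 is not reproduced here, so the sketch below assumes the exact method for a fourfold table with fixed marginal totals (hypergeometric probabilities) and counts only the tables showing as much evidence, or more, in favor of Treatment W; the function names are illustrative.

```python
from math import comb

def table_probability(x, row_total, col_total, grand_total):
    """Probability that a sample of row_total individuals contains x recoveries
    when col_total of the grand_total individuals recover (margins fixed)."""
    return (comb(col_total, x)
            * comb(grand_total - col_total, row_total - x)
            / comb(grand_total, row_total))

def one_tail_p(recoveries, row_total, col_total, grand_total):
    """Sum over tables with the observed number of recoveries or more."""
    most = min(row_total, col_total)
    return sum(table_probability(x, row_total, col_total, grand_total)
               for x in range(recoveries, most + 1))

# Treatment W: 5 recoveries among 6; in all, 7 recoveries among 15 patients.
p = one_tail_p(5, 6, 7, 15)
print(round(p, 4))
```

With these figures the sum is 175/5005, or about 0.035, which is greater than 0.025 and so agrees with the verdict of no significant difference.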


We therefore say that there is no significant difference between the two 
samples. We have not sufficient evidence to say that the Treatment W is 
better than Treatment V, and, for all we know, Treatment V may be better 
than Treatment W. 


If P had been between 0.025 and 0.005 we should have said that there was 
a significant difference, and we could express this result in other ways also: 

(1). We do not believe that chance (random sampling variation) is respon- 
sible for the difference. 


(2). There is a significant association between difference in treatment and 
difference in recovery rate (proportion of recoveries). 


(3). There is a significant heterogeneity or a significant lack of homogeneity 
between the samples, i.e., we believe that they are not from the same ‘material’ 
or population. 
Levels of Significance


If our samples had shown a difference in favor of Treatment V we should 
have applied the same standard, P = 0.025, referring to the 2½% of samples
in which differences of that type occurred. As in the argument from sample 
to population (p. 8) we are therefore excluding a total of 5% of samples, 
with odds of 19 to 1 against finding such samples (V better than W, or W better 
than V) if there is no real difference between the effects of the two treatments; 
and similarly, with P = 0.005 we are excluding a total of 1% of samples 
(odds of 99 to 1). 


One-sided Comparisons 


Sometimes, of course, we are concerned with differences in one direction 
only. For example, we may try the effects of a treatment by comparison 
with a sample of control animals. We may know that the treatment cannot 
do any harm, and we ask: Is treatment significantly better than no treatment? 
This is a ‘one-sided’ comparison, in contrast to the ‘two-sided’ comparison of 
Treatments V and W. As is shown in Section C, Note 5, P values of 0.05 
and 0.01, instead of 0.025 and 0.005, would be the logical criteria in one-sided 
comparisons, but this complication is often unnecessary, and we shall in most 
instances treat one-sided comparisons as if they were two-sided. We are 
thereby merely raising our standards of significance somewhat for the one- 
sided comparisons. 


(For further discussion of contingency tests see Section C, Note 12.) 


5. RANDOM SAMPLING 
Importance of Random Sampling 


The object of the experiment discussed in Subsection 4 was to see whether 
differences in treatment would or would not create a significant difference 
between the samples—a difference greater than could be reasonably accounted 
for by random sampling. Therefore obviously it is no use analyzing the data, 
or conducting the experiment at all, unless we have allocated the treatments
to the two samples of patients by strictly random methods. When we are not
experimenting but observing, e.g., the sex incidence of severity of disease, or
the racial frequency of a muscle, it is often difficult to generalize because we 
often cannot be confident that a hospital or anatomy laboratory has provided 
us with random samples. 
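
A strictly random allocation of treatments is easily arranged. The sketch below is one possible procedure, not the technique of Section C, Note 23, which is not reproduced here: it shuffles the patients' serial numbers and gives Treatment V to the first nine and Treatment W to the remaining six. The function name, the serial numbers, and the seed are illustrative assumptions.

```python
import random

def allocate_treatments(patient_ids, n_first, seed=None):
    """Shuffle the patients and split them into two treatment groups."""
    rng = random.Random(seed)
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)  # every ordering of the patients is equally likely
    return sorted(shuffled[:n_first]), sorted(shuffled[n_first:])

# Fifteen patients numbered 1 to 15: nine for Treatment V, six for Treatment W.
group_v, group_w = allocate_treatments(range(1, 16), n_first=9, seed=26)
print("Treatment V:", group_v)
print("Treatment W:", group_w)
```

The allocation should be made before any outcome is observed, so that the rule of selection is independent of the observations to be made.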


Simplicity of Techniques 

Although proper techniques are now commonly used in many types of 
investigation that are very like those of clinical and laboratory medicine, some 
medical investigators still feel that the techniques are needlessly complex
and artificial. A little experience will show how simple and practical they
are—see Section C, Note 23, for examples. 


Equality of Numbers 


Random sampling is sometimes easier if equal numbers of individuals are 
chosen for each sample, but the most important reason for choosing equal 
samples is that, for a given total number of individuals, equal samples provide
the most sensitive tests, i.e., they give the greatest amount of information. 
However, unequal samples are sometimes desirable. For example, a certain 
form of treatment may be expensive, time-consuming, difficult, disagreeable, 
or even dangerous to the subjects. It may then be desirable to treat fewer 
individuals by this method and use more in the other group. 
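
The advantage of equal samples can be illustrated roughly with the large-sample standard error of the difference between two proportions. The formula used below is the ordinary binomial approximation, which this article warns is unreliable for small samples; it is offered only as an assumption-laden sketch to show the direction of the effect, and the total of 40 individuals and the proportion 0.5 are arbitrary illustrative choices.

```python
from math import sqrt

def se_difference(n1, n2, p=0.5):
    """Approximate standard error of the difference between two sample
    proportions when both samples come from a population with proportion p."""
    return sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

total = 40
for n1 in (20, 15, 10, 5):
    n2 = total - n1
    print(n1, n2, round(se_difference(n1, n2), 3))
```

The standard error is smallest for the equal division and grows as the division becomes more unequal, which is the sense in which equal samples give the most sensitive test for a given total.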


Bias and Sample Size 


Many workers believe that small samples are specially apt to be biased.
For one who holds this belief, the best way, probably the only way, to 
appreciate its fallacy is to become acquainted with statistical reasoning and 
methods. He will then see that the only difference between small random 
samples and large random samples is that the small samples do not give as 
precise information regarding the sampled population, and that, however 
small the sample may be, the degree of this precision can be estimated. He 
will further realize that unequal samples, when properly treated, do not 
introduce bias. 


Section B—Examples 
SUMMARY OF EXAMPLES 


1. Argument from a sample to a population. 

(1) N = total number of individuals in sample. Two classes—A, 
not-A. (Classify as ‘A’ the individuals that form not more than 
half the sample.) Example 1 for general methods, interpretation, 
and comments. 

(a) Number of A’s in sample: 0 to 20, whatever the value of N. 
Examples 1 and 2. Table IA and IB. In addition: 
(i) N in sample between two N values in the tables. 
Examples 3 and 4. 
(ii) To answer the question: Does the sample indicate 
that A’s form less than half the population, i.e., that 
not-A’s are in the majority? Examples 5, 6, and 24. 
(iii) Number of A’s in sample = 0. Example 7. 
(b) Number of A’s in sample more than 20. Example 8. Table
II; Graphs 1 to 3 (= Figs. 6 to 8). In addition: 
(i) N or percentage of A’s in sample between values in 
Table II. Examples 9 and 10. 
(ii) To answer the question of majority and minority as 
in (a) (ii), above. Example 11. 
(iii) Percentage of A’s in sample less than 5. Example
12. Tables II and III; Graphs 4 to 6 (= Figs. 9
to 11).


(2) More than two classes in sample—A, B, C, etc. Example 13. 
Chi square. 
2. Comparison of samples.


(1) Comparison of two samples, each divided into two classes (a four- 
fold contingency table). 

(a) Two equal samples, from 1 to 20 individuals in each. 
Example 14. Table IV. 

(b) Two unequal samples containing up to 20 individuals in one
sample, up to 19 in the other. Examples 15 to 18. Table V. 

(c) Two equal samples containing 20 to 1000 individuals in each. 
Example 19. Table VI. 

(d) Any pairs of samples, equal or unequal, containing more 
than 20 individuals in either. Examples 20 to 23, 25, 26. 
Chi square. 


(2) Comparison of more than two samples and of samples with more 
than two classes (contingency tables larger than fourfold). 
Examples 27, 28, and 29. Chi square. 

3. Combination of information from two or more samples. Examples 30 
and 31. 
4. Confidence limits for differences between samples. Example 32. 
5. Sizes of samples required. 
(1) In estimation of confidence limits. Examples 33 and 34. 
(2) In comparison of samples. Examples 35 and 36. 
6. Measurements treated as qualitative statistics. Examples 37 to 40. 


1. ARGUMENT FROM SAMPLE TO POPULATION 


(1) TWO-CLASS SAMPLES 
Example 1 


The same numerical problem can take many forms. Five variants are given. 


(1). A physician treats 25 patients suffering from a certain disease and 15 
die (60%). In this disease the mortality, based on reports of thousands of 
cases, is 40%. What right has the physician to think that his sample is 
exceptional? 


(2). Twenty-five sailors, tested by motion in a rocking machine, were 
pronounced susceptible to motion sickness, i.e., they always developed 
symptoms when so tested. A certain drug was administered, the test was 
repeated, and 15 (60%) were found to be immune to the motion. What 
confidence can we have that the true percentage, approached by performing 
the same experiment on thousands of similarly susceptible men, would be 
more than 40%? In more general terms, between what limits should we 
expect the true percentage to lie? 


(3). A certain inoculation invariably produces disease in unvaccinated rats. 
In 25 rats, previously vaccinated, 10 (40%) develop the disease and 15 (60%) 
do not. The experimenter says: ‘If this vaccine is unlikely to protect more 
than 75% of the rats, I do not wish to try it on hundreds of animals. I prefer 
to set about developing another vaccine.’ 


14 CANADIAN JOURNAL OF RESEARCH. VOL. 26, SEC. E. 


(4). A pharmacologist injects a substance in 25 dogs and produces an effect, 
such as vasoconstriction or death, in 15 of them. He asks: “What is the 
error in this estimate (60%), i.e., what might be the percentage if I performed 
the same experiment on more and more dogs of the same kind?” This variant 
indicates how the methods to be discussed here are applicable in biological 
assay, including toxicity tests. 


(5). The combined pedigrees of several families show the occurrence of a 
certain disease or defect in 15 of 25 children. If a certain genetic mechanism 
is at work the ratio to be expected is 3:1. Does the sample of 25 agree with 
this hypothesis? 


Method 


In each variant there is a sample of 25 with 15 in one class and 10 in the 
other. Call the class that contains less than half the total ‘A’ because, to 
save duplication, Tables IA, IB, and II are arranged for numbers up to half 
the total. The number of A’s is 10; therefore use Table IB—number of A’s: 
1 to 20. (For procedure where the number of A’s is over 20 see Example 8.) 
Under heading “No. of A’s in sample = 10”, find the confidence limits along 
the line N = 25. (For procedure where N lies between two values in the 
table see Examples 3 and 4; for N greater than 1000 see Example 4.) 


The wide limits (P = 0.005) are 16.8 and 67.0%, the medium limits 
(P = 0.025) 21.1 and 61.3%, and the narrow limits (P = 0.1), which may 
be needed for some purposes, are 26.5 and 54.8%. 


Since the problem is stated in terms of the other class (not-A) subtract each 
of these from 100%, to give: 

Wide limits (P = 0.005)—83.2 and 33.0%, 

Medium limits (P = 0.025)—78.9 and 38.7%, 

Narrow limits (P = 0.1)—73.5 and 45.2%. 
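
The tabulated limits can be checked numerically. The sketch below is only an illustration in Python, and it rests on an assumption not stated in this passage: that Tables IA and IB hold exact binomial limits with probability P in each tail. The function names `binom_cdf` and `exact_limits` are hypothetical, chosen for this sketch.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_limits(a, n, tail):
    """Lower and upper limits (in percent) for `a` A's in a sample of `n`,
    with probability `tail` in each tail (assumed table construction)."""

    def solve(f, target):
        # f is decreasing in p; bisect for the p at which f(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if f(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Upper limit: the p for which P(X <= a) equals `tail`.
    upper = 1.0 if a == n else solve(lambda p: binom_cdf(a, n, p), tail)
    # Lower limit: the p for which P(X >= a) equals `tail`.
    lower = 0.0 if a == 0 else solve(lambda p: binom_cdf(a - 1, n, p), 1 - tail)
    return 100 * lower, 100 * upper


low, high = exact_limits(10, 25, 0.025)
print(round(low, 1), round(high, 1))              # limits for class A
print(round(100 - high, 1), round(100 - low, 1))  # limits for class not-A
```

Under that assumption the call for 10 A's in 25 should agree with the medium limits quoted above to within rounding; the complement step mirrors the subtraction from 100%.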


Interpretation 


Variant (1).—The true mortality may be between 38.7 and 78.9%, with- 
out the sample being exceptional. If the physician thinks that his sample 
(60% mortality) indicates a mortality higher than 38.7% he is adopting 
standards that will lead him astray on more than 5% of the occasions on which 
he estimates upper and lower limits, and if he thinks that his sample indicates 
a mortality higher than 45.2% he is adopting standards that will vitiate more 
than 20% of such estimates (see Section A 3). 


Variant (2).—Accepting the usual standards of judgment, we are not 
justified in assuming that, if thousands of similar men were examined, more 
than 38.7% of them would show immunity after the drug; and we are even 
less justified in assuming an immunity of more than 40%. 




MAINLAND: STATISTICAL METHODS—SEC. B, EX. 1 15 


Variant (3).—There is no significant difference between the sample frequency 
(15 in 25) and 75%, for there is no adequate reason to suppose that the 
vaccine will protect fewer than 78.9% of such rats (or more than 38.7%). 

Variant (4).—The pharmacologist can state, with confidence represented by 
odds of 19 to 1, that the true percentage is unlikely to be outside the range of 
38.7 to 78.9, and he can state, with confidence represented by odds of 99 to 1, 
that it is unlikely to be outside the range 33.0 to 83.2. 

Variant (5).—The ratio 3:1 means 3 out of 4, i.e., 75%. There is no 
significant difference between the sample frequency (15 in 25) and 75%. The 
observation does not disagree with the hypothesis. 

Comments 

It is perhaps desirable to point out that two types of problem have been 
illustrated here: 

(a). A test of the significance of the difference between a sample and a 
(real or hypothetical) population value—Variants (1) and (5). 

(b). An estimation of confidence limits—Variant (4). The other variants 
have elements of both types. 

To the investigator, however, the distinction may be unimportant, because, 
as has been shown, the desired information has been obtained by the same 
technique for each variant. 

Comment on Variant (1).—Mortality in 25 patients. A percentage or ratio 
derived from a sample, even if the sample contains thousands of individuals, 
is not the same as a postulated percentage or ratio, as in (2) or (5). The 
postulate may be purely hypothetical or may be derived from observed 
phenomena (e.g., dice-throwing or genetic mechanisms), but it is always 
exact. It can be looked on as a ‘true’ value, derived from a population that 
is infinite, i.e., as large as we care to make it; whereas an actual (finite) 
sample, however large, can only give an estimate. The distinction is often 
unimportant if the sample contains several thousands, but percentages are 
frequently given in textbooks and other secondary sources without sample 
size, although the latter, on further search, turns out to be quite small. 


The proper method of treating the physician’s sample of 25 would therefore 
be comparison with the sample that gave the reported 40% mortality. How- 
ever, as a preliminary step we can treat the 40% as a postulated ‘true’ value, 
as was done above, and then: 

(a). If the sample shows no significant difference from this postulated value 
we can be assured that comparison of samples would show still less evidence 
of a difference, because the postulated ‘true’ value is equivalent to the value 
for an ‘infinitely large’ sample. 

(b). If the observed sample shows a significant difference from the postulated 
percentage, we must find the size of the reported sample. If it is 1000 or 
more it will often be safe enough to take its frequency as a population 
frequency; but it is safer, and not difficult, to compare the observed sample 
with the reported sample. (See also Examples 3 and 26.) 

(c). If there is a significant difference between the samples we must be 
extremely cautious in drawing conclusions regarding the causal factors. For 
example, hospitals tend to receive more severe cases than those treated in 
private practice, and their data vary with changes in the customs of people 
regarding hospitalization for a certain disease and with changes in availability 
of hospital space. Statistics from Departments of Health suffer from many 
weaknesses, e.g., differences in diagnoses between different individual 
physicians, between the physicians of different regions, and at different periods. 


Comment on Variant (2).—The design of the experiment on motion sickness 
is unsatisfactory and thereby illustrates an important feature common to a 
wide variety of experiments. To be very confident that a man who had 
shown susceptibility, even in all previous tests, would show it if he did not 
receive the drug, would require many men and perhaps many tests on each. 
The results of these experiments would have to be expressed with an estimate 
of error (as a confidence limit or in some other form), and this estimate should 
be allowed for in the test of the drug. A well-designed experiment contains in 
itself its estimate of error, and a procedure such as the following could be 
suggested: 


(a). If it is desired to select the apparently more susceptible men, do so by 
a preliminary test. 


(b). Allocate in advance to the selected group, strictly at random, treatment 
and nontreatment. This takes care of all such factors as variable degrees of 
susceptibility, activities before the test and time of day when the test is made. 
Even loss of men owing to unforeseen circumstances (unless associated with 
their susceptibility to motion sickness) does not introduce a bias, although it 
removes the advantage of equal numbers. 


(c). Compare the two samples as in the Examples indicated under that 
heading in the Summary of Examples (Subsection 2). 


(d). If there is a significantly greater immunity in the drug-treated men, 
estimate for them the confidence limits as shown in the present example. 
The answers will show estimates of immunity after the drug, not necessarily 
wholly due to the drug, even if the nontreated men, in the particular experi- 
ment, all developed motion sickness. 
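
Step (b) of the procedure above, strictly random allocation fixed in advance, can be sketched as follows. This is only an illustration: the function name `allocate`, the subject labels, and the seed are hypothetical, and a pseudo-random shuffle stands in for whatever random device the investigator prefers.

```python
import random


def allocate(subjects, seed=None):
    """Randomly split subjects into (treated, control) groups.

    The two groups differ in size by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical labels for 50 men selected by the preliminary test.
men = [f"man_{i:02d}" for i in range(1, 51)]
treated, control = allocate(men, seed=1948)
print(len(treated), len(control))
```

Recording the seed, or the allocation list itself, before the experiment begins keeps the assignment independent of anything observed later.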


Comment on Variant (3).—Vaccination of rats. See Comment on Variant 
(2). The same experimental design, with strictly random sampling, should be 
used, to randomize the variations due to such factors as constitutional differ- 
ences between rats of different litters, differences in response to inoculation, 
and numerous environmental differences, e.g., in cage positions. The investi- 
gator may say that such factors can have only a slight effect, if any. He 
nevertheless recognizes that rats differ in their response to inoculation and 
vaccination, but is ignorant of many of the factors responsible. Randomiza- 
tion automatically distributes these factors, so that the differences between 
the vaccinated animals and controls are, except for the vaccination, due to 
chance. 

Comment on Variant (4).—Vasoconstriction in dogs. The data are 
enumeration data (qualitative statistics—vasoconstriction present or absent), 
which may be sufficient for the investigator’s needs, perhaps as a preliminary 
survey; but where measurement is possible (e.g., amount of injected sub- 
stance, degree of vasoconstriction) mensuration data are obtained and, in 
general, they give more information than do enumeration data. 


Comment on Variant (5).—The sample does not disprove the hypothesis— 
the genetic mechanism that is responsible for a 3:1 ratio. But this does not 
prove that the hypothesis is correct. Reference to the confidence limits 
already obtained will show that: 


(a). The hypothesis of a 2:1 ratio (66.7%) is not disproved by the sample, 
because, even for P = 0.1, the true value may be 73.5%. 

(b). A 13:3 ratio (81.25%) is unlikely (P less than 0.025), but not very 
unlikely (P less than 0.005). 

(c). A 1:2 ratio (33.3%) is unlikely; and is almost as low as the lower 
limit, 33.0% for P = 0.005. 

Where there is a limited number of hypotheses as here, we can, by increasing 
the sample size, disprove one after another with the degree of confidence that 
we desire, until only one remains, which therefore is the most likely to be 
correct. 
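
The screening of rival hypotheses against the limits of Example 1 can be written out as a short sketch. The limits are those quoted earlier for the not-A class; the names `LIMITS`, `ratio_to_percent`, and `narrowest_level` are hypothetical, introduced only for this illustration.

```python
# Limits (percent) for the not-A class from Example 1, keyed by tail area P,
# listed from the narrowest pair to the widest.
LIMITS = [
    (0.1, 45.2, 73.5),
    (0.025, 38.7, 78.9),
    (0.005, 33.0, 83.2),
]


def ratio_to_percent(affected, unaffected):
    """Convert a ratio such as 3:1 to the percentage in the first class."""
    return 100 * affected / (affected + unaffected)


def narrowest_level(percent):
    """Return P for the narrowest limits that still contain `percent`,
    or None if it lies outside even the widest limits."""
    for tail, low, high in LIMITS:
        if low <= percent <= high:
            return tail
    return None


for affected, unaffected in [(3, 1), (2, 1), (13, 3), (1, 2)]:
    pct = ratio_to_percent(affected, unaffected)
    print(f"{affected}:{unaffected} = {pct:.2f}% -> {narrowest_level(pct)}")
```

A hypothesis that falls only within the wide limits corresponds to the "unlikely, but not very unlikely" verdict in (b) and (c).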


Example 2 


An anatomy textbook states that the posterior ethmoidal nerve is found in 
only 30% of orbits. The sample size is not given, as it ought to be if the 
percentage is to have any value, but the original report revealed that it was 10. 
How far is the generalization (30%) justified? 


Method 


In Table IB, under “Number of A’s in sample = 3”, for N = 10 find the 
medium confidence limits (P = 0.025)—6.7% and 65.2%. 


Interpretation 


For the various populations, of which this could be a random sample, the 
percentage might lie anywhere between the indicated limits. 


Comment 


Although these particular data on nerve frequency are intrinsically of 
little importance, they illustrate a weakness that should be guarded against in 
more important problems. In bilateral organs or parts (kidneys, orbits, 
limbs, etc.), because of the varying tendency toward similarity of the two 
sides, reports, and calculations based on them, should make the distinction 
clear. Thus, if there were here five cadavera (10 orbits) with two nerves in 
one and one nerve in a third, this would not afford an estimate for a random 
sample of orbits unless there were no tendency to bilateral symmetry of nerves. 


Example 3 


In a class of 98 male medical students eight are color blind. Accepting a 
textbook statement that 4% is the common incidence of color blindness in 
males, are we justified in expressing surprise at this ‘100%’ excess? 


Method 


In Table IB, under “Number of A’s in sample = 8”, find the lower 
confidence limit (P = 0.025) for N = 90 (3.9%) and N = 100 (3.5%). 
For N = 98 the limit must be about 3.6%. Our sample shows no significant 
difference from a 4% population value. 


Comment 


For the danger of using reported percentages as if they were true (popula- 
tion) percentages, see Example 1—Comment on Variant (1), and Example 26. 
If the test had shown a significant difference and we wished to pursue the 
problem, we should find the records of the samples underlying the textbook 
statement, and compare those samples with our own. If this showed a 
significant difference we should be justified in seeking causes, but we should 
bear in mind three things: 


(1). We may not have been counting the same thing as the other observers, 
i.e., their definitions of, or tests for, color blindness may have been different 
from ours. 


(2). Let us suppose that a physiologist found in his class of 98 students a 
frequency of color blindness that would occur by chance once in, say, 30 such 
samples from a population with 4% frequency. With a new class of the same 
size every year he should expect to meet such a chance occurrence once in a 
teaching career of 30 years. 


(3). “The ‘one chance in a million’ will undoubtedly occur, with no less 
and no more than its appropriate frequency, however surprised we may be 
that it should occur to us”—Fisher (10). 


Example 4 


The same problems as in Example 1, but with N = 102, number of A’s in 
sample = 10. In that section of Table IB, the required confidence limits 
must lie between those for N = 100 and those for N = 110. In many cases 
this will be sufficient information. 


If a somewhat more precise estimate is desired, note that 102 is one-fifth 
of the way from 100 to 110, and interpolate mentally, giving, for example, 
about 3.7% for the extreme lower limit and a little under 20% for the extreme 
upper limit. 

If a still more accurate estimate is needed, divide 102 (the observed N) by 
100, to obtain 1.02, and divide this into the values for 100, to obtain 3.7% 
for the extreme lower limit and 19.8% for the extreme upper limit. 

If N in the sample is, say, 108, i.e., nearer to 110 than to 100, divide 108 
into 110, to get 1.02, and multiply 1.02 by the tabulated values for N = 110. 


If N is halfway between two tabulated values of N, either of the above 
methods of interpolation can be used. 


Note how these rules apply at the foot of the table. One can find the limits 
for N = 1000 by dividing 500 into 1000, to give 2, and then dividing the 
values for 500 by 2. 

If N in the sample is greater than 1000, divide the values for 1000 by N/1000. 
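
The division and multiplication rules of this example amount to one operation: for a fixed number of A's, scale the tabulated percentage by the ratio of the tabulated N to the observed N. A minimal sketch follows; the function name `scale_limit` is hypothetical, and the check uses the two tabulated values quoted in Example 3 (3.9% at N = 90 and 3.5% at N = 100, for eight A's).

```python
def scale_limit(tabulated_percent, tabulated_n, n):
    """Approximate a confidence limit (percent) for sample size `n` from the
    value tabulated at `tabulated_n`, for the same number of A's."""
    return tabulated_percent * tabulated_n / n


# Example 3: lower limit (P = 0.025) for eight A's in N = 98,
# approached from each neighbouring tabulated N.
from_100 = scale_limit(3.5, 100, 98)
from_90 = scale_limit(3.9, 90, 98)
print(round(from_100, 1), round(from_90, 1))
```

Both routes give about 3.6%, the value stated in Example 3, which suggests that either neighbouring table entry can serve as the starting point.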


Example 5 

Among eight bubonic plague patients who had meningitis six were male 
and two were female (18). Is it justifiable to say that males seem to be more 
susceptible? 

This is an example of a common form of misleading data (see Comments), 
but we can use the figures to show how to answer the question: Is there any 
evidence of a significant majority of males? 


Method 


Let A be the number of females. In Table IB, under “Number of A’s 
in sample = 2”, for N = 8 the upper confidence limits are 53.9% (P = 0.1) 
and 65.1% (P = 0.025). There is therefore no reason to believe that further 
investigation would not show that the females were in the majority. 


Comments 


Whether the difference was significant or not we could tell nothing about 
the relative sex incidence, because we do not know the proportions of the 
sexes in the population of plague patients from which the sample was drawn. 
The proper method of analyzing would be to divide the plague patients into 
two samples—(1) males: number with meningitis, number without meningitis; 
(2) females: number with, number without. Comparison of the two samples 
would show whether there was a significant association with sex. 


Data presented as in this example can be misleading. For instance, in 
312 patients with hookworm disease 55.8% were men and 44.2% were 
women (12). If one takes these data as suggesting that, in general, the male 
incidence is higher than the female, one is making the unwarranted assumption 
of male-female equality in the general population and in the part of it available 
for this survey. 


Another example of the same kind of presentation occurs in a report on the 
incidence of tuberculosis in the naval service (17). Of the total number of 
cases, 5.1% were females, 8.6% were officers, 52.5% were seamen, etc., 
12.9% were engine room workers, and the remainder were in other groups. 
Without a statement of the total numbers employed in each group no con- 
clusion can be drawn regarding the relative incidence in the different groups; 
and yet a reader, meeting such data, is very apt to draw some such conclusion. 


Example 6 


Two treatments, V and W, have been applied to each of 10 animals, e.g., 
two methods of treating experimental wounds or burns. The results are: 


Animal No. 
1 — No appreciable difference 
2 — W better than V 
3 — W better than V 
4 — No appreciable difference 
5 — W better than V 
6 — V better than W 
7 — W better than V 
8 — W better than V 
9 — No appreciable difference 
10 — W better than V 

Is there evidence that W is really better than V, i.e., is there a significant 
majority in which W is better? 

Similar data might represent 10 specimens of a certain microorganism, each 
divided for growth on two plates of culture medium, a growth-inhibitor having 
been added to one plate of each pair (or one growth inhibitor to one plate and 
another to its fellow). 


Method 


Nos. 1, 4, and 9 tell us nothing about the possible differences. In the 
remaining seven, six show W better than V, and one shows V better than W. 
In Table IB, under “Number of A’s in sample = 1”, for N = 7 find the 
upper limit (P = 0.025)—57.9%. 


Interpretation 


For all that the sample proves, if we increased the number of such obser- 
vations we might find that, among those that showed an appreciable difference, 
V was better than W in more than 57%—the opposite of what the sample 
suggests. Therefore there is no adequate proof that W is really the better 
treatment. 


Comments 


(1). It is desirable where possible in such cases to apply methods of 
measurement, e.g., number of days to wound healing, size or number of 
bacterial colonies, or even a method of grading the effects, e.g., by 1, 2, 3, 4, 
etc., provided that the system of grading is uniform. Methods of testing 
mensuration data can then be applied and may display significant differences 
where enumeration methods do not. For example, in the animal where V 
was better than W the difference might only be slight, whereas W might be 
much better than V in several animals. 

The test applied above would, however, be appropriate where there is no 
satisfactory system of measurement, or to compare gross or qualitative effects, 
e.g., colonies and no colonies on culture media. 

(2). It has already been shown that there is no significant difference between 
our sample of seven and a true (population) value of 57%—a majority of 7% 
in favor of V. Therefore, a fortiori, there is no significant difference from a 
population percentage of 50—equal effects of V and W. It might be asked: 
“If there were no real difference between V and W, how often would samples 
showing as great an apparent difference in favor of W be found by chance?” 
By the method of Section C, Notes 2 and 3, it can be proved that P for our 
sample = 0.0625, i.e., far greater than 0.025. We should expect 6.25% of 
samples of seven to show as great a difference as observed, or greater. 
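
The value P = 0.0625 can be reproduced by direct summation of the binomial terms for a 1:1 hypothesis. The sketch below is an illustration only; it is not necessarily the method of Section C, Notes 2 and 3, which lies outside this passage, and the function name `upper_tail` is hypothetical.

```python
from math import comb


def upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 1/2): the chance of at least k
    results favouring one treatment among n that showed a difference."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n


p = upper_tail(6, 7)   # six or more of seven favouring W
print(p)
```

With seven informative animals there are 2^7 = 128 equally likely outcomes under the hypothesis, of which 7 + 1 = 8 show six or seven in favour of W, giving 8/128 = 0.0625.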

(3). The data might represent 10 pairs of animals, Treatments V and W 
having been allotted at random within each pair; but one must beware of 
artificial pairing, because much information can be thereby lost or obscured. 
Pairs of animals from each of 10 litters could be used, provided that there was 
good reason to believe that their reactions to the particular disease or experi- 
mentally induced condition, for which Treatments V and W were to be tested, 
would be more alike than the reactions of animals selected purely at random. 
Again, in testing a vaccine as a possible preventive of infection in human 
beings, it may be desirable to make use of the fact that within a family there 
is similarity in risk of exposure to the disease and in other environmental 
features. Within each family, of course, the treated and control members 
must be chosen strictly by a random sampling method. 

(4). The hypothesis tested here was that, among animals showing an 
appreciable difference under the two treatments, the true (population) ratio, 
(W better than V) :(V better than W), is 1:1. Animals 1, 4, and 9 were 
omitted because the hypothesis does not tell us how many animals to expect 
in which no appreciable difference could be detected. The whole sample of 10 
would, however, be taken into consideration if other questions or hypotheses 
were to be considered; e.g., if we asked: “What are the confidence limits for 
the percentage showing an appreciably better result with W?” In that case 
the other class would contain (1) those that showed a better result with V, 
and (2) those that showed no appreciable difference. 

(5). The same figures as in this example might have been obtained by 
comparing animals, subjected to a certain treatment, with their untreated 
litter-mates. If we knew that the treatment could not be worse than no 
treatment, we should be concerned only with samples in which treatment was 
apparently better than no treatment—a one-sided comparison, mentioned in 
Section A4, and discussed in Section C, Note 5. The test of significance, 
applied to the two-sided comparison in the present example, need not be 
altered, and if the observer wished to use P = 0.05 as a criterion there would 
still be no significant evidence that treatment was better than no treatment, 
for P = 0.0625. 


Example 7 


A certain blood substitute, transfused into 80 patients, produced no unfavor- 
able reaction. It is possible that, transfused into all the present and future 
inhabitants of the world, it would produce no reaction, but we wish to know 
how much confidence we may derive from the sample of 80, i.e., what proportion 
of individuals might react if we transfused this substance into more and 
more of the people represented by our 80 as a random sample. 


Method 


The number of A’s in the sample is zero. Therefore use Table IA. For 
N = 80, the upper limits are 4.5% (P = 0.025) and 6.4% (P = 0.005). 


Interpretation 


Extensive further investigation might reveal that the percentage of 
persons presenting unfavorable reactions was as high as 4.5%, although we 
can feel reasonably confident that it is unlikely to be more. In a problem of 
this kind, however, we are more inclined to demand the greater assurance 
offered by the 1% limit (P = 0.005) and, if so, we must say that we do not 
trust the percentage of unfavorably reacting individuals to be less than 6.4. 

Comments 


(1). These statements apply to a population of which the 80 patients 
would be a random sample. By increasing the size of the sample from the 
same type of patient (the same population), if we still found no reactors, we 
could reduce the upper confidence limit as far as we wished; but it would 
obviously be more useful to increase the variety of patients at the same time 
as we increased our sample size. 

(2). Table I shows that, even if we have no A’s in a sample, we require 
a sample of six before we can be reasonably confident (P = 0.025) that the 
population contains fewer than 50% A’s, and we require a sample of eight 
before we can be very confident of this (P = 0.005). 
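
When the sample contains no A's the upper limit has a simple closed form, which can be used to check the figures in this example. The sketch assumes, as the quoted values suggest, that the limit p satisfies (1 − p)^N = P; the function names `upper_limit_zero` and `smallest_n_below` are hypothetical.

```python
def upper_limit_zero(n, tail):
    """Upper limit (percent) for the population frequency of A when a
    sample of n contains no A's, with probability `tail` beyond the limit."""
    return 100 * (1 - tail ** (1 / n))


def smallest_n_below(percent, tail):
    """Smallest sample size, with no A's observed, whose upper limit
    falls below `percent`."""
    n = 1
    while upper_limit_zero(n, tail) >= percent:
        n += 1
    return n


print(round(upper_limit_zero(80, 0.025), 1), round(upper_limit_zero(80, 0.005), 1))
print(smallest_n_below(50, 0.025), smallest_n_below(50, 0.005))
```

The first line corresponds to the limits read from Table IA for N = 80, and the second to the sample sizes mentioned in Comment (2).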

Example 8 


In the problems of Example 1, let us suppose that the total number in the 
sample is 80, with 48 (60%) in one class and 32 (40%) in the other. Calling 
the less numerous class ‘A’, we find 32 A’s, and as this is more than 20 we 
use Table II, or Graphs 1, 2, and 3 (= Figs. 6, 7, and 8). 


Under “Percentage of A’s in sample = 40” find N = 80 and read off the 
limits. Alternatively, in Graphs 1, 2, or 3, find N = 80 at the foot, run up 
to the lines for 40% and find the confidence limits by reference to the 
left-hand scale. 


When the values in the observed sample are not directly given in Table II, 
interpolation in the graphs will often be sufficient—see Examples 9 and 10. 


Example 9 

When confidence limits are required from a sample containing more than 
20 A’s (A being the less numerous class, as usual), Table II is to be employed 
as in Example 8, but in some cases interpolation is necessary. If N is 
intermediate between two values in Table II, but the percentage of A’s is 
directly given by the table, proceed as in the following example. 

N = 95; number of A’s = 38, i.e., 40%. In Table II find “Percentage of 
A’s in sample = 40”. The required limits must be between those for N = 90 
and those for N = 100; and such an approximation is often sufficient. For a 
more precise estimate, use Graphs 1, 2, or 3 (= Figs. 6, 7, or 8). Thus, for the 
extreme lower limit (P = 0.005) in Graph 1 run up the vertical line for 
N = 95 until it strikes the line of lower limits (continuous line) for 40%, and 
read off the required value, i.e., 27.5%. 


For most purposes interpolation in the table will probably be unnecessary, 
but, if it is desired, simple (linear) interpolation is sufficiently accurate. 
Thus, we take the limits for N = 95 about halfway between the tabulated 
values for N = 90 and N = 100. For the extreme lower limit the difference 
between the tabulated values is 27.8 − 27.2 = 0.6. Half of 0.6 = 0.3. 
Therefore for N = 95 the limit is approximately 27.8 − 0.3 = 27.5%. 


If N were 92, the required limit would be taken as one-fifth of the distance 
from 27.2 to 27.8, and similarly for other intermediate values of N. (If the 
percentage in the sample is intermediate between two tabulated values, see 
Example 10.) 


Example 10 


N, i.e., total number in sample = 52; number of A’s in sample = 22, i.e., 
percentage of A’s in sample = 42.3. The confidence limits cannot be read 
directly from Table II, but for many purposes the graphs will be sufficient. 
Thus, to find the lower limit at the 1% level (P = 0.005) in Graph 1 (= 
Fig. 6), use the continuous (lower limit) lines for 40% and 45%. At the foot 
of the graph locate the position of N = 52 and run up to a point nearly half- 
way from the 40% line to the 45% line. Reading the level of this point on the 
left-hand side of the graph it appears to be above 25% but not quite 25.5%. 
Further, by noting that the sample percentage is situated at 2.3/5, i.e., 0.46, 
of the distance from 40 to 45, it was found possible, with the aid of a millimeter 
scale, to estimate the required limit as 25.3%. 


Alternatively, using Table II we note the following values for lower limits: 


              Percentage of A’s in sample 
    N             45%         40% 

    50            27.3        23.0 
    55            28.1        23.7 


This is sufficient to show that the required value is between 23 and 28%, or 
approximately 25%. Such information may be enough in many problems; 
but for a more precise result we can proceed as follows by simple interpolation: 


The N in our sample, 52, lies at 2/5 of the distance from 50 to 55. Therefore, 
under 45% the required value should be at about 2/5 (or 4/10) of the 
distance from 27.3 to 28.1, i.e., 0.4 of (28.1 − 27.3) = 0.4 of 0.80 = 0.32, 
i.e., with only one decimal figure, 0.3. This, added to 27.3, gives 27.6 for 
the 45% column. 

Similarly, for the 40% column the value is 23.0 + 0.4 of (23.7 − 23.0) = 
23.0 + 0.28 = 23.3 to one decimal place. 


Our sample percentage, 42.3, lies at 2.3/5 (i.e., 0.46) of the distance 
between the two values found above, i.e., 0.46 of the distance from 23.3 to 27.6. 
It is therefore 23.3 + 0.46 of (27.6 − 23.3) = 23.3 + 2.00 = 25.3%, as 
was found by interpolation in the graph. 
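
The simple (linear) interpolation of Examples 9 and 10 can be sketched in a few lines. Only the table values quoted in those examples are used; the helper name `lerp` and the dictionary layout are hypothetical conveniences for this illustration.

```python
def lerp(x0, y0, x1, y1, x):
    """Linear interpolation: value at x on the line through (x0, y0), (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Example 9: lower limit (P = 0.005), 40% A's, N = 95 lies between 90 and 100.
limit_95 = lerp(90, 27.2, 100, 27.8, 95)

# Example 10: lower limits from Table II, keyed by (N, percentage of A's).
table = {(50, 40): 23.0, (50, 45): 27.3, (55, 40): 23.7, (55, 45): 28.1}
n = 52
pct = 100 * 22 / 52                                   # about 42.3%

at_40 = lerp(50, table[(50, 40)], 55, table[(55, 40)], n)   # interpolate on N
at_45 = lerp(50, table[(50, 45)], 55, table[(55, 45)], n)
limit_52 = lerp(40, at_40, 45, at_45, pct)            # then on percentage

print(round(limit_95, 1), round(limit_52, 1))
```

Interpolating first on N and then on the sample percentage follows the order used in the text; because the intermediate values are not rounded here, the last digit may occasionally differ from a hand calculation.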


Example 11 


Intranasal inoculation of 57 monkeys with poliomyelitis virus (8) gave the 
following results:— 
Arms first paralyzed: 25 animals, 
Legs first paralyzed: 27 animals, 
Arms and legs paralyzed at about the same time: 5 animals. 


Arm paralysis indicates involvement of cervical segments of the spinal cord; 
leg paralysis indicates involvement of lumbar segments. Is it justifiable to 
state that “the virus can, and does in more than half the cases, produce its 
first manifestations in the lumbar segments”? 


Method 


The third class (five animals) tells us nothing about unequal frequencies 
of involvement. Of the remaining 52 animals, 25 showed arm paralysis first, 
27 showed leg paralysis first. The number of A’s (25) is more than 20; 
therefore use Table II and Graphs 1 to 3 (= Figs. 6 to 8), first expressing 25/52 
as a percentage—48.08%; N = 52. 


If the true value of A (arm paralysis first) could be more than 50% without 
transgressing the limits that we usually allow for the effects of chance, the 
true value for not-A’s (leg paralysis) could likewise be less than 50%. 


Graph 2 (= Fig. 7) shows that the upper limit (P = 0.025) lies between 62 
and 63%. Graph 3 (= Fig. 8) shows that even for P = 0.1 the upper limit 
is about 58%. 


In Table II the required upper confidence limit for A’s would be found 
between the values for sample percentages 50 and 45, with N between 50 and 
55. In this instance, however, the verdict is obvious because, even if the 
percentage of A’s in the sample were 45 instead of 48.08, and even if we had 
information from 55 animals instead of only 52, the population percentage 
could be as high as 58.9, i.e., the percentage of not-A’s (legs paralyzed first) 
could be as low as 100 — 58.9 = 41.1. 


Interpretation 


The sample does not show a significant majority of animals with leg 
paralysis earlier than arm paralysis; i.e., there is no adequate evidence of a 
tendency for legs to be paralyzed first. 


Even if we were content with a much lower degree of confidence (P = 0.1) 
we could not say that we believed that the true percentage of not-A’s (leg 
paralysis first) would be more than about (100 — 58) = 42%. 


Comment 

An opposing hypothesis regarding the mode of spread or action of the 
virus might entail the earlier paralysis of the arms in a majority of animals, 
e.g., that the proportion of A’s should be 60%. The sample does not disprove 
that hypothesis because the upper limit (P = 0.025) is over 60%. 
Example 12 


Table II carries the percentage of A’s in the sample down to 0.1, i.e., 1 per 
1000, and Graphs 4, 5, and 6 (= Figs. 9, 10, and 11) cover the regions from 
5% to 0.1%. 

In these regions of the graphs, N becomes larger than is common in small- 
scale medical research, and less detail is given than in Graphs 1 to 3 (= Figs. 
6 to 8). Therefore, although the methods of Examples 8 to 11 can be used,
greater precision will sometimes be necessary, and sometimes very large
samples (greater than 20,000) with percentages of less than 0.1 will be met.
Table III, derived by interpolation in Table VIII1 of Fisher and Yates (11),
will help in such cases, and can be used for even higher percentages (up to 10%, 
with N greater than 200). For example: N = 460; number of A’s = 22. 
Therefore percentage of A’s = 4.783. Required: the upper confidence limit 
(P = 0.005). 

Method 

The procedure can be divided into four stages. 


Stage (1). Express the percentage as a decimal fraction, i.e., the sample
probability, p = 0.04783. Then q = 1 - p = 0.95217.


Stage (2). Estimate the standard deviation $\sqrt{Npq}$, which is easier to
estimate when expressed as $\sqrt{Aq}$, where A (= Np) is the number of A’s in
the sample, here 22. Four-figure logarithms are adequate.


Log A = log 22 = 1.3424. Log q = log 0.95217 = $\bar{1}.9787$ (i.e., -1 + 0.9787).
Log Aq = 1.3424 + $\bar{1}.9787$ = 1.3211. Log $\sqrt{Aq}$ = ½ log Aq = 0.6606.
Do not convert from logarithms at this stage.


Stage (3). Use the general formula for any confidence limit:

    $A \pm F\sqrt{Aq}$ + appropriate correction term from Table III.

F is a factor depending on the level of the limit required:

    For P = 0.005, F = 2.5758; log F = 0.4109.
    For P = 0.025, F = 1.9600; log F = 0.2922.
    For P = 0.1,   F = 1.2816; log F = 0.1077.


The factors are obtained from the normal curve (Section C, Note 7). The 
values given here were derived from Table I of Fisher and Yates (11), and the 
logarithms from seven-figure tables. 


The addition (+) leads to an upper limit, the subtraction (-) to a lower
limit. The correction terms are always added, never subtracted. 




26 CANADIAN JOURNAL OF RESEARCH. VOL. 26, SEC. E. 


To find, for instance, the upper limit (P = 0.005) for our sample: 
F = 2.5758. Log F = 0.4109. Log $\sqrt{Aq}$ = 0.6606 (see
above). Therefore log $F\sqrt{Aq}$ = 1.0715. Therefore $F\sqrt{Aq}$ = 11.79.


From Table III the correction term for 4.783% (approximately 4.8%) = 
+ 2.84. 


Therefore $A + F\sqrt{Aq}$ + correction term = 22.00 + 11.79 + 2.84 =
36.63. This is the limit in terms of number of A’s in a sample of 460.


Stage (4). Convert this absolute value into a percentage, i.e., 36.63 ×
100/460 = 7.96%.
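The four stages can also be carried out directly, without logarithms. The sketch below is illustrative only: the function name is hypothetical, and the correction term (2.84) is simply taken as read from Table III, since that table is not reproduced here.

```python
from math import sqrt

def approximate_limit(n, a, f, correction, upper=True):
    """Confidence limit for the number of A's, following Stages (1) to (4).

    n          -- sample size
    a          -- number of A's observed
    f          -- normal-curve factor F for the chosen level
    correction -- correction term read from Table III (always added)
    """
    p = a / n                    # Stage (1): sample probability
    q = 1 - p
    spread = f * sqrt(a * q)     # Stage (2): F times sqrt(Aq), since Np = A
    if upper:                    # Stage (3): add for upper, subtract for lower
        count = a + spread + correction
    else:
        count = a - spread + correction
    percent = 100 * count / n    # Stage (4): convert to a percentage
    return count, percent

count, percent = approximate_limit(n=460, a=22, f=2.5758, correction=2.84)
print(f"limit as a count: {count:.2f}; as a percentage: {percent:.2f}%")
```

For the figures of this example the sketch reproduces the values obtained above by logarithms (36.63 and 7.96%).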


The reader will arrive at a very similar result by using Table II as in 
Example 10: 


Under percentage of A’s = 5, interpolate for N = 460 between 450 and 
500. Under percentage of A’s = 3, interpolate for N = 460 between 400 
and 500. Interpolate for 4.78 between 3 and 5%. 


Although Graphs 4, 5, and 6 (= Figs. 9, 10, and 11) are not specially 
designed to cater for samples of less than 1000, it will be seen that for the 
present example (N = 460; percentage in sample = 4.8 approximately)
interpolation in Graph 4 between the lines for 5% and 3% indicates an upper 
limit of approximately 8%, as was found by calculation. 


1. ARGUMENT FROM SAMPLE TO POPULATION (continued) 


(2) SAMPLES WITH MORE THAN TWO CLASSES 
Example 13 


Among 50 individuals (persons or animals) receiving a certain inoculation, 
or infected by a certain organism, the reaction is as follows: A (mild), 13; 
B (moderate), 17; C (severe), 12; D (very severe), 8. Are we justified in 
believing that these discrepancies are due to something more than chance, i.e., 
are they anything more than one should expect in a random sample of a 
population in which the true ratio is 1:1:1:1? In other words, is there an 
equal probability (1/4 or 0.25) for each of the four classes?


Method 


To use a method such as gives the information for two-class samples in 
Tables IA, IB, and II would be laborious, and we substitute a method that,
with proper precautions, is sufficiently accurate: the chi square (χ²) test.
The rationale of this test is discussed in Section C, Note 9. Here we are 
concerned with the method of using it. 


The quantity chi square can be described in general terms as a measure of 
the discrepancy between an observation and a hypothesis. The greater the 
discrepancy, the larger is chi square, and, since the random sampling variations 
in chi square are well known, we can find in a table of chi square whether the 
discrepancy is greater than is usually attributed to chance. The hypothesis 
in the present example is that the population has equal frequencies in the four
classes, A, B, C, and D, i.e., that the ratio is 1:1:1:1; but any other ratio,
representing a hypothetical population composition, can be tested in the same 
way. For example, in a breeding experiment we know that, if a certain four 
characteristics are distributed among the progeny according to a certain 
Mendelian law, the animals should be distributed in four classes in the ratio 
9:3:3:1. There can, also, be any number of classes. 

On the hypothesis of equal frequencies we should expect in Class A
one-quarter of the total (50), i.e., 12.5 individuals. This is the ‘expected’ or
‘hypothetical’ or ‘theoretical’ value (t) corresponding to the ‘observed’ or
‘actual’ value (a), i.e., 13. Now find the difference between a and t (i.e., 0.5),
square it, giving 0.25, and divide by t, to give 0.25/12.5, i.e., 0.02. (Two
decimal places are usually enough, and a slide rule or a four-figure logarithm
table lightens the work.)


Proceed in this way for all four classes: 


Class      a       t      (a - t)    (a - t)²    (a - t)²/t
A         13     12.5       0.5        0.25         0.02
B         17     12.5       4.5       20.25         1.62
C         12     12.5       0.5        0.25         0.02
D          8     12.5       4.5       20.25         1.62
Total     50     50.0              Chi square  =    3.28


(The t values would, of course, differ in the different classes if the hypothetical
ratio were not one of equality, e.g., if it were 9:3:3:1.)
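The tabulated arithmetic can be expressed compactly. The following sketch uses a hypothetical helper that accepts any hypothetical ratio, so that the same steps serve for 1:1:1:1 or for 9:3:3:1; it is an illustration of the calculation above, not a substitute for the precautions discussed later.

```python
def chi_square(observed, ratio):
    """Chi square for observed class frequencies against a hypothetical ratio.

    Returns the chi square value and the list of expected (t) values.
    """
    total = sum(observed)
    parts = sum(ratio)
    expected = [total * r / parts for r in ratio]   # the t values
    value = sum((a - t) ** 2 / t for a, t in zip(observed, expected))
    return value, expected

observed = [13, 17, 12, 8]                 # classes A, B, C, D
value, expected = chi_square(observed, [1, 1, 1, 1])
degrees_of_freedom = len(observed) - 1     # one less than the number of classes
print(f"t values: {expected}")
print(f"chi square = {value:.2f} with {degrees_of_freedom} degrees of freedom")

# With a 9:3:3:1 hypothesis the t values are no longer equal.
_, mendelian_expected = chi_square(observed, [9, 3, 3, 1])
print(f"t values for 9:3:3:1: {mendelian_expected}")
```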

Turn now to Table VII. Two features in the table require comment: 
(1) Degrees of freedom, (2) Probabilities. 


Degrees of Freedom 


When a chi square value is calculated, as in the present example, by 
testing the frequencies in a sample against an exact population ratio, known 
or hypothetical, the number of degrees of freedom is one less than the number 
of classes in the sample, e.g., four classes give three degrees of freedom. This 
rule, without explanation, is sufficient in practice and a full explanation would 
be long and complicated. 


Note, however, that with a total of four classes, when we have calculated
the t values for any three (even if the t values are not equal, as they are in the
present example) we know, from the total number of individuals, what the
remaining t value must be, i.e., it is not independent or ‘free’.


Probabilities of Chi Square 


In Table VII, for three degrees of freedom, a chi square value of 7.815
has a probability, P, of 0.05. Our value of chi square, 3.28, is far below this;
therefore there is no significant difference between our sample and the hypothesis
(a ratio of 1:1:1:1). If the sample had given a chi square value
greater than 7.815 (P less than 0.05) we should have said there was a significant
difference, and if chi square had been greater than 11.345 (P = 0.01)
we should have called the difference highly significant or very significant.


In terms of the present example the information given by Table VII means 
that, if there were a population in which the ratio of the numbers in the four 
classes was 1:1:1:1, and if we took a large number of random samples of 
50 and calculated chi square for each sample, only 0.05 (i.e., 5%) of the chi 
square values would be more than 7.815 and only 0.01 (i.e., 1%) of the chi 
square values would be more than 11.345. More exactly, we should approach 
closer to these proportions by taking more and more samples. 
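The repeated-sampling statement can be illustrated by simulation. The sketch below is a rough demonstration under stated assumptions: the number of samples (20,000) and the random seed are arbitrary choices, and because the sample frequencies are whole numbers the observed proportions only approximate 0.05 and 0.01.

```python
import random

def chi_square_equal(counts):
    """Chi square for class counts against a hypothesis of equal frequencies."""
    t = sum(counts) / len(counts)
    return sum((a - t) ** 2 / t for a in counts)

def simulate(n_samples=20000, size=50, classes=4, seed=1):
    """Draw random samples from a 1:1:1:1 population; return their chi squares."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_samples):
        counts = [0] * classes
        for _ in range(size):
            counts[rng.randrange(classes)] += 1   # each class equally likely
        values.append(chi_square_equal(counts))
    return values

values = simulate()
frac_05 = sum(v > 7.815 for v in values) / len(values)
frac_01 = sum(v > 11.345 for v in values) / len(values)
print(f"proportion above 7.815:  {frac_05:.3f}")
print(f"proportion above 11.345: {frac_01:.3f}")
```

Increasing `n_samples` brings the proportions closer to their long-run values, which is the sense of the final sentence above.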


Comments 


(1). There may appear to be a discrepancy between the levels of significance
(P = 0.05 and 0.01) adopted in the use of chi square tables and the
half-values (0.025 and 0.005) used for Tables IA, IB, and II; but the standards 
are really the same (see Section A3 and Section C, Note 11). 


(2). Tests on samples containing more than two classes do not lead as 
directly to an estimate of confidence limits as with twofold classifications. 
When there are three or more classes the variety of possible population 
frequencies is enormous, and there is no one set of limits. We can, however, 
set up any hypothetical ratio and ascertain as above whether the sample 
agrees or disagrees with the hypothesis. 


Precautions in the Use of Chi Square in Argument from Sample to Population 


For reasons discussed in Section C, Note 10, certain precautions must be 
taken in the use of chi square, as in this example, to test sample frequencies: 


(1). If t in one or more classes is less than 5, it is usually best to add those
classes to neighboring classes. Thus, a five-class sample (four degrees of
freedom) may have to become a three-class sample (two degrees of freedom),
or even a two-class sample, suitable for testing by Tables IA, IB, or II. (For
the exception to this rule see (3) below.)

(2). If t in each class is five or more, and especially if it is 10 or more, one
can confidently accept a verdict of ‘not significant’, ‘significant’, or ‘highly
significant’, but it is uncertain how far one can trust the chi square probabilities
in greater detail.

(3). Even when t in one or more classes is less than 5 (down perhaps to 1)
there need be little hesitation in accepting a verdict of ‘nonsignificance’ if the
chi square value is well below the significant level, because in general the
tendency for a low t value is to make the chi square probability lower than the
true probability, i.e., to heighten the suggestion of significance.


Note.—Chi square can be used, as in this example, when there are only two 
classes in the sample (Section C, Note 11). 


2. COMPARISON OF SAMPLES


(1) TWO SAMPLES, EACH IN TWO CLASSES 
Example 14 
Twenty-four rats are subjected to a certain vapor that is liable to cause 
death. Two treatments, A and B, are applied, strictly at random, to equal 
numbers of animals. 


Died Survived Total 
Treatment A 4 8 12 
Treatment B 9 3 12 
Total 13 11 24 


Treatment B is already known to have some beneficial effect. Is there any 
evidence that Treatment A tends to be more successful? We do not know 
whether A may be better or worse than B or of the same value. Therefore 
we ask: Is there a significant difference between the frequencies of death in 
the two samples? 


Method 
For general principles see Section A4, and for further discussion see 
Section C, Note 12. 
Because the samples are equal and neither contains more than 20 individuals, 
use Table IV, rearranging the samples thus: 


Survived Died 
Sample (1)—Treatment B 3 9 
Sample (2)—Treatment A 8 4 


This rearrangement is necessary because, to avoid duplication in Table IV,
the sample with the more unequal proportions is called Sample (1), here 3 : 9
rather than 8 : 4; and the smaller quantity in Sample (1) is always placed on
the left. Note: These figures represent actual frequencies (numbers of
individuals) although, for typographical convenience, they are separated by a
colon, as are ratios.

In Table IV find N = 12, and under that, in the column headed “Sample
No. (1)” find 3 : 9. In the adjacent section of the column headed “Sample
No. (2)” find 8 : 4, and read off the probability, 0.0498. Since this is greater
than 0.025 the observed samples show no significant difference in frequency
of death.
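The tabulated probability can be reproduced by direct enumeration. The following is a minimal sketch, assuming that the entry in Table IV is the exact one-tailed probability with all marginal totals fixed, i.e., the probability of the observed fourfold table together with all tables that are more extreme in the same direction. The function name is illustrative.

```python
from math import comb

def one_tail_probability(a, b, c, d):
    """Exact one-tailed probability for the fourfold table

        a  b
        c  d

    with margins fixed: the observed table plus every table in which the
    top-left frequency is smaller still.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    ways_total = comb(row1 + row2, col1)
    # math.comb returns 0 when a table is impossible (k greater than n).
    ways = sum(comb(row1, x) * comb(row2, col1 - x) for x in range(a + 1))
    return ways / ways_total

# Sample (1), Treatment B: 3 survived, 9 died.
# Sample (2), Treatment A: 8 survived, 4 died.
p = one_tail_probability(3, 9, 8, 4)
print(f"P = {p:.4f}")
print("significant" if p < 0.025 else "not significant at P = 0.025")
```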


Interpretation


There is not sufficient evidence to indicate which treatment, if either, is 
the better. 


Comments


(1). The misleading impression created by percentages can be illustrated 
by conversion of the data: 


Died Survived 
Treatment A 33.33% 66.67% 
Treatment B 75.00% 25.00% 


The difference, although 41.67%, has been shown to be nonsignificant. 
Other expressions are still more misleading, e.g., “Animals treated by B had
two and a quarter times the death rate of the animals treated by A, an
increase of 125%.”

(2). Table IV was prepared because of the numerous small-scale experiments
in which equal samples are used, and because in small samples the chi square
method, used in later examples, is often too inaccurate. Exact probabilities
for all possible combinations are given because it may be desired to extract as
much information as possible from small samples (see Comment (6)).

(3). Equality of samples is, of course, not necessary for the proper assessment
of evidence, because all reliable tests give due weight to sample size,
but equality is very desirable for other reasons (see Section A5).


(4). If the original samples had been unequal, but had given results as follows: 


Died Survived Total 
Treatment A 4 9 13 
Treatment B 8 3 11 
Total 12 12 24 


we could still have used Table IV because there would have emerged two equal
samples (died and survived). 

(5). Inspection of the smallest samples in Table IV shows that, with equal 
samples, we require four individuals in each before we can demonstrate a 
significant difference (P less than 0.025) and even then the samples must 
have directly opposed composition (0:4 and 4:0). Similarly, for a highly 
significant difference (P less than 0.005) we need two samples of five. 

(6). In this example we have supposed that we did not know at the outset 
whether A or B might in reality be the more successful, if, indeed, there were 
any real difference. The same figures might, however, represent a different 
situation, because we might be quite sure that A could not be any less effective 
than B, or the sample labeled ‘‘Treatment B’’ might be untreated control 
animals, and we might be quite sure that Treatment A could do no harm. 
The question would then be: Is the death rate of treated animals significantly
lower than the death rate of controls?

This situation, the ‘one-sided comparison’, was mentioned in Section A4
and is discussed in Section C, Note 5. In most instances the two types of 
situation are treated in these Examples as if they were the same. Thus, 
taking the original table of the present example, we have decided that there 
was no significant difference between the effects of A and B, because P was 
greater than 0.025. If the table had been in the form ‘treatment versus no 
treatment’, we should have proceeded in exactly the same way and concluded 
that there was no significant difference, and we should then have taken this 
as equivalent to saying that treatment was not significantly better than no 
treatment. If, however, the investigator desires to use other criteria with 
one-sided comparisons, e.g., P = 0.05 instead of 0.025, Table IV provides
him with the necessary probabilities for equal samples up to N = 20. 
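The minimum sample sizes stated in Comment (5) can be verified by a short calculation. This sketch assumes, as in the comment, two equal samples of directly opposed composition (0 : N and N : 0) and the exact one-tailed probability with fixed margins, which for such a table reduces to one divided by the number of ways of choosing N individuals from 2N. The function names are illustrative.

```python
from math import comb

def opposed_probability(n):
    """Exact one-tailed probability for two samples of n with
    compositions 0 : n and n : 0 (the most extreme table possible)."""
    return 1 / comb(2 * n, n)

def smallest_sample(threshold):
    """Smallest equal sample size at which even directly opposed
    samples give a probability below the threshold."""
    n = 1
    while opposed_probability(n) >= threshold:
        n += 1
    return n

n_significant = smallest_sample(0.025)
n_highly_significant = smallest_sample(0.005)
print(f"significant (P < 0.025):        N = {n_significant}")
print(f"highly significant (P < 0.005): N = {n_highly_significant}")
```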


Example 15


In Section A4 the following table was given: 


Treatment     Recoveries     Deaths     Total
V                  2            7          9
W                  5            1          6
Total              7            8         15


Is there evidence of a real difference between the effects of the two treatments? 
Method 


Neither sample contains more than 20 individuals, but the samples are
not equal. Therefore use Table V. In the column headed “Larger Sample”
find N1 = 9 (the V-treated sample) and in the adjacent column look for
N2 = 6 (the smaller sample, W-treated). If the frequencies in N1 are 2 and 7,
when the smaller sample (N2 = 6) contrasts as strongly as possible (6 and 0)
the difference is significant (P = 0.0060, i.e., less than 0.025) but is not
highly significant; and a smaller difference, e.g., 5 and 1 as in our present
problem, is not significant.


Interpretation


There is no evidence that V and W are different in respect of the frequency
of deaths.


Comments on Table V


(1). Note that, to save duplication, the table has been arranged so that
the smaller frequency in N1 is on the left (2 : 7, not 7 : 2).


(2). Following down N1 = 9 and N2 = 6, note that frequencies of 3 : 6 and
6 : 0 are significantly different, but there is no entry for the next stage, i.e.,
4 : 5 in N1, because even when N2 differs as much as possible from this, the
difference is not significant.

(3). To appreciate more fully the structure of Table V, observe some other
entries. Thus, under N1 = 15 and N2 = 4, the difference between the
frequencies 2 : 13 and 4 : 0 is highly significant, but no other sample of four
differs significantly from a sample of 15 when the frequencies in the latter are
2 and 13. Again, under N1 = 15 and N2 = 14, with frequencies of 7 and 8
in N1, 14 : 0 shows a highly significant difference. Keeping the 7 : 8 constant
and changing the frequencies in N2 step by step we find that 13 : 1 shows a 
significant difference, but 12 : 2 does not, nor does 11 : 3, nor any of the other 
possible arrangements up to and including 2:12. A still greater change in 
the same direction, however, to 1 : 13, gives a significant difference, and the 
last move, to 0 : 14, takes us into the highly significant class. In terms of 
the present type of example: 


Treatment   Recoveries   Deaths   Verdict
V                7          8     A significantly higher recovery rate with W
W               13          1
V                7          8     No significant difference
W                2         12
V                7          8     A significantly higher recovery rate with V
W                1         13


Example 16 


In an investigation (13) of the in vitro sensitivity of Hemophilus influenzae 
to penicillin, 18 strains of the organism were isolated from cerebrospinal fluid 
and inoculated on plates of blood agar containing 0.5 units of penicillin per cc. 
of the medium. The cultures were classified according to degree of growth 
and according to the characteristics of the strain, smooth or rough. 


                        Growth
Strain     Heavy or moderate    Slight or nil    Total
Smooth             4                  8            12
Rough              5                  1             6
Total              9                  9            18


The strains appear to differ in amount of growth. Is the difference significant? 
Method 


The two unequal samples (smooth and rough) can be tested by Table V
as in Example 15. Under N1 = 12, N2 = 6, we find that, when the frequencies
in N1 are 4 and 8, the frequencies in N2 must be at least 6 and 0 to show a
significant difference, instead of 5 and 1 as here. Our sample shows no
significant difference, i.e., P must be greater than 0.025. In this case,
however, we can find the exact probability by taking advantage of the equality
of the two ‘growth’ samples (nine in each sample). Use Table IV as in
Example 14. Under N = 9, for a frequency in N1 of 1 : 8 and a frequency in
N2 of 5 : 4, P = 0.0656.
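Reading the table by its ‘growth’ samples rather than by its ‘strain’ samples is legitimate because the exact probability depends only on the four cell frequencies and their marginal totals. The sketch below illustrates this, assuming the Table IV entry is the exact one-tailed probability with fixed margins; the function name is illustrative.

```python
from math import comb

def one_tail_probability(a, b, c, d):
    """Exact one-tailed probability (margins fixed) for the table
    a b / c d, summing over tables with top-left frequency <= a."""
    row1, row2 = a + b, c + d
    col1 = a + c
    ways = sum(comb(row1, x) * comb(row2, col1 - x) for x in range(a + 1))
    return ways / comb(row1 + row2, col1)

# Read by strain (unequal samples of 6 and 12):
#   Rough:  1 slight or nil, 5 heavy or moderate
#   Smooth: 8 slight or nil, 4 heavy or moderate
p_by_strain = one_tail_probability(1, 5, 8, 4)

# Read by growth (equal samples of 9 and 9):
#   Slight or nil:     1 rough, 8 smooth
#   Heavy or moderate: 5 rough, 4 smooth
p_by_growth = one_tail_probability(1, 8, 5, 4)

print(f"by strain: P = {p_by_strain:.4f}")
print(f"by growth: P = {p_by_growth:.4f}")
```

Both readings give the same probability, which is why Table IV (equal samples) can supply the exact value here although the strain samples are unequal.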


Example 17 


In a report on an investigation of traumatic shock it was stated that, after 
dogs were experimentally injured by a certain technique, if they were kept at 
an environmental temperature of 95° F., 73% died, whereas if they were 
“cooled to an equivalent degree” only 40% died. The report did not state 
the actual number of animals and therefore the percentages were useless, but 
inquiry revealed that 15 dogs had been heated and 10 cooled, with the following
results:


Died Survived Total 
Heated 11 4 15 
Cooled 4 6 10 
Total 15 10 25 


Is there any evidence that heating and cooling differ in their effects on 
frequency of survival? 


Method 


In Table V look for the larger sample under N1 = 15 and the smaller sample
under N2 = 10. When the frequencies in N1 are 4 and 11, as here, and the
frequencies in N2 are 8 and 2, or 9 and 1 (or, of course, 10 and 0) the difference
between the samples is significant, but if there is any greater similarity
between the samples, e.g., 7 : 3 or 6 : 4 in N2, the difference is not significant.

Interpretation 


There is not sufficient proof of a real difference in effect between heating 
and cooling. So far as this sample shows, heating may tend to lower the
frequency of death. 


Example 18 


In 20 patients with meningitis (31) due to Hemophilus influenzae, five were 
treated with sulphonamides alone, and only one recovered. Fifteen had 
combined sulphonamide-penicillin treatment and eight of them recovered. 
Examine the statement: “The total number of cases is too small and the
methods of treatment varied too much to allow of statistical assessment.
The only conclusion to be drawn is that the encouraging recovery rate suggests
pushing to the limit the combined sulphonamide-penicillin treatment. . . .”
The first step is to see how far chance could account for the apparent difference 
in the effects of the two treatments. 


Method

Arrange in a fourfold table: 


                                            Deaths   Recoveries   Total
Larger sample (sulphonamide-penicillin)        7          8          15
Smaller sample (sulphonamide alone)            4          1           5


In Table V, under N1 = 15 and N2 = 5, with frequencies in N1 of 5 and 10, a
sample of five with frequencies of 5 and 0 shows a significant difference; but
no further entries are given, because, if the frequencies in N1 are 6 : 9, 7 : 8,
etc., a sample of five shows no significant difference even if its frequencies
are 5 : 0.


Interpretation 


Accepting the usual criteria of significance, we can assert that the samples 
show no reason why we should prefer the one treatment to the other. 

It may be suggested, however, that in treating such a serious disease we 
should not demand as high odds (as low P values) as usual. If so, we must 
assess the probabilities exactly, by the method of Section C, Note 13. This is 
equivalent to providing an item for Table V in the region where our observed 
samples would fit, and P, thus determined, is 0.2214. 
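The exact value can be obtained by enumerating the possible tables with the observed marginal totals. The following sketch is not a transcription of the method of Section C, Note 13, which is not reproduced here; it assumes only that the quoted P is the one-tailed exact probability with fixed margins, and the function name is illustrative.

```python
from math import comb

def table_probability(x, small_n, large_n, recoveries):
    """Probability that exactly x of the recoveries fall in the smaller
    sample when the marginal totals are fixed."""
    total = small_n + large_n
    return comb(small_n, x) * comb(large_n, recoveries - x) / comb(total, recoveries)

small_n, large_n = 5, 15          # sulphonamide alone; combined treatment
recoveries = 1 + 8                # recoveries in the two samples together
observed = 1                      # recoveries with sulphonamide alone

# Observed table plus the one more extreme table (no recoveries at all
# in the smaller sample).
p_one_sided = sum(
    table_probability(x, small_n, large_n, recoveries)
    for x in range(observed + 1)
)
p_doubled = 2 * p_one_sided       # either treatment might have been favored

print(f"one-sided P = {p_one_sided:.4f}")
print(f"doubled     = {100 * p_doubled:.2f}%")
```

The doubled figure corresponds to the case in which apparent superiority of either treatment would have been accepted.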

An investigator who is familiar with the principles of contingency tests 
(Section A4; Section C, Notes 12 and 13) will realize that this probability has 
the following implications: 

(1). If he accepts the observed samples as indicating superiority of the 
combined treatment, and if he would have accepted, on the same basis, samples 
that appeared to favor the treatment by sulphonamide alone, he is accepting 
standards that will lead him to proclaim a real difference between treatments 
in more than 44% of his investigations in which there is no real difference 
(twice 22.14%). 

(2). Even when he knows that one treatment (e.g., the combined treatment 
in this example) cannot be any worse than the other, if he accepts P = 0.2214 
as indicating real superiority of one treatment he will proclaim superiority in 
more than 22% of his investigations in which there is no superiority. 


Comments 


This example deserves careful consideration because it illustrates one of 
the most vital problems of clinical research. Referring to the statement 
quoted at the beginning we can say: 

(1). No sample is too small for statistical assessment. 

(2). It appears to be felt that, in addition to the distinction between the 
two basic treatments (sulphonamide and sulphonamide-penicillin), there were 
so many differences in methods of treatment that the two samples would not 
be comparable. They would have been comparable, however, if the two basic 
treatments had been allocated strictly at random. Some of the ancillary 
factors would tend toward recovery, others toward death, and some patients 
would have a predominance of recovery factors, others a predominance of
lethal factors. By strict randomization these two classes of patient would 
have had an equal chance of appearing in the sulphonamide group and in the 
sulphonamide-penicillin group. 

(3). Since the samples were not randomized they may have obscured some 
advantage or disadvantage in either of the treatments, and this would still be 
true even if our analysis had shown a significant superiority in one of them. 
As it is, our test has merely shown that there is not sufficient evidence to 
justify preference of either treatment. 

(4). The physician might say: “I am unwilling to experiment with my 
patients. If, because of some suggestive work by other observers, or on the 
principle that two antibacterial agents are likely to be better than one, I feel 
that it would be better to give the combined treatment, I consider it my duty 
to do so.” Therapeutically he would be correct, but he should then realize:

(a). That his procedure would thereby cease to be an investigation. 

(b). That the “suggestive work by other observers” may have little
sound evidence to support it—hence many of the changes in therapeutic 
fashions. If it were realized how often the apparent evidence in favor of some 
form of treatment could be explained by bad sampling or by the effects of 
chance, practitioners would probably not think that they were risking their 
patients by depriving them of such a treatment. 

(5). Certain compromises might be suggested: 


(a). To apply the sulphonamide treatment to the milder cases, the 
combined treatment to the more serious. But even if the combined treatment 
were better, unless it were very much better, one might easily emerge with a 
nonsignificant difference. Moreover, the experiment would probably break 
down when a mild case became more serious and the physician would feel 
impelled to apply the combined treatment. 


(b). To give the combined treatment, but vary the amounts of penicillin 
and sulphonamide in different groups. Here again the experiment would be 
successful only if the plan were strictly adhered to. 

(c). To increase the number treated by the combined treatment and 
compare the results with the five already treated by sulphonamide alone. 
But the sample of five and the sample treated by the combined method would 
not be strictly comparable because not random, and the same criticism applies 
to a comparison of data from one hospital (or physician) using sulphonamides 
alone with those of another using the combined treatment.


(6). Without experimentation no therapeutic advance can be made, and 
the practitioner is sometimes faced with the difficult question whether he 
shall apply a treatment that may entail some risk, or loss of some possible 
advantage, to a few patients in order to secure knowledge that may benefit 
a much larger number. One way of minimizing the risk is to plan the experi- 
ment well and to analyze the results in such a way as to extract the maximum 
amount of information from a small sample. 
Example 19

Treatments V and W are tested on two equal samples, 30 individuals in each.

Treatment    Deaths   Recoveries   Total
V              14         16         30
W              24          6         30
Total          38         22         60


Is there any evidence that the treatments differ in their effects? 


Method 


The common method of treating this would be by the method of Example 
20; but, for equal samples, Table VI gives some useful indications of minimum 
differences, expressed in percentage form. The table appears somewhat 
complicated because, for any pair of samples (1) and (2), there is not just one 
minimum significant difference between the percentages of A’s in the samples. 
It varies according to the actual percentage of A’s in Sample (1), and it varies
according to whether there are more A’s or fewer A’s in Sample (1) than in
Sample (2).

To use the table, proceed as follows:


(1). Find the percentages of the two classes in the samples: 


Deaths Recoveries Total 
Sample (1) V 46.67% 53.33% 100% 
Sample (2) W 80.00% 20.00% 100% 
Difference 33.33% 33.33% 


(2). Take either sample as Sample (1) and from it determine which class is
to be A: the class that contains not more than 50% in Sample (1). If we
take V as Sample (1), deaths (46.67%) will be A’s, recoveries will be not-A’s.

(3). Find N (here 30) in Table VI and, if, as here, there are more A’s in
Sample (2) than in Sample (1), use the left half of the table.

(4). The percentage of A’s (46.67) lies between the tabulated values 25
and 50. To be highly significant the difference would have to be 36.67%,
instead of 33.33% as in our samples. To be significant it need only be 30.00%.
The same conclusion is reached if we take the W’s as Sample (1), again using
the left half of the table. (Note that the blanks in the right half indicate that
maximum possible differences do not reach the specified level of significance.)


Interpretation


Our samples show a significant difference in death rate, i.e., a significant
association between treatment and outcome.

Comments


(1). Refer to Example 18. In the present example, V corresponds to the 
sulphonamide-penicillin treatment, W to the sulphonamide treatment. The 
percentage frequencies of deaths are the same, but in the present example N
in each sample is 30. This shows that, if the proportions remained the same 
as in Example 18, we should require samples of almost 30 to prove a significant 
difference. 


(2). Note some relations between N and the minimum significant difference 
in Table VI. If the percentage of A’s in Sample (1) is zero, we can, at most 
levels in the table, halve the required difference by doubling N or we can 
reduce the difference to one-third by trebling N—a simple inverse relation. 
Where the percentage of A’s is 50 there is approximately an inverse square- 
root relation, e.g., by quadrupling N we only halve the required difference. 
Even where the percentage is not 50, we can form a useful estimate of required 
sample size by assuming the inverse square-root relation (Example 36). The 
estimates will tend to be too high, but this is safer than the opposite error. 


(3). For convenience of interpolation two columns from the left half of 
Table VI have been repeated in the right half. 
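
The two relations described in Comment (2) can be turned into rough sample-size estimates. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the inverse relation is taken to hold when the percentage of A’s in Sample (1) is zero, and the inverse square-root relation is taken to hold otherwise. The figures used (a minimum significant difference of 30% at N = 30) come from Example 19; Table VI itself is not reproduced here.

```python
from math import ceil

def required_n_inverse_sqrt(n_known, diff_known, diff_target):
    """Estimated N per sample, assuming the minimum significant difference
    is proportional to 1 / sqrt(N) (percentage of A's near 50)."""
    return ceil(n_known * (diff_known / diff_target) ** 2)

def required_n_inverse(n_known, diff_known, diff_target):
    """Estimated N per sample, assuming the minimum significant difference
    is proportional to 1 / N (percentage of A's in Sample (1) equal to zero)."""
    return ceil(n_known * diff_known / diff_target)

# Halving a required difference of 30% known for N = 30:
print(required_n_inverse_sqrt(30, 30.0, 15.0))  # 120, i.e. N is quadrupled
print(required_n_inverse(30, 30.0, 15.0))       # 60, i.e. N is doubled
```

As the text notes, the square-root estimate tends to be too high when the percentage is not 50, which errs on the safe side.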
Example 20 

In an investigation (16) of the possible value of DDT as a preventive of
scabies, soldiers with and without scabies were questioned regarding the use
of DDT during the previous two months, either as a dusting powder or
impregnated in their shirts.


DDT No DDT Total 
Soldiers with scabies 29 23 52 
Soldiers without scabies 64 36 100 
Total 93 59 152 


Was there any evidence that DDT tended to prevent scabies? 


Since we should not expect soldiers who used DDT to have a higher incidence 
of scabies than those who did not, this is really a one-sided comparison (Section 
A4, and Section C, Note 5), but we shall as usual test it for the significance of 
the difference between the two samples, as if DDT and some other agent were 
being compared and we had no knowledge which might be more effective.


Choice of Method 


As the samples contain more than 20 individuals we cannot use Tables 
IV or V, and since they are unequal we cannot obtain any indication from 
Table VI. We therefore use chi square (χ²) already described in Example 13
as a measure of the discrepancy between an observation and a hypothesis. 
The hypothesis here is the same as when Tables IV to VI are applied, i.e., 
that the samples are from the same population—in the present problem, that 


38 CANADIAN JOURNAL OF RESEARCH. VOL. 26, SEC. E. 


there is no real difference in scabies incidence between those who use DDT 
and those who do not. In Example 13 chi square tested a sample against a 
hypothetical population ratio. Here we are using it with a contingency table 
and the arithmetical technique is different. In Section C, Note 14, the step- 
by-step technique is given, but a short method is easier. 


Short Method of Calculating Chi Square for Fourfold Tables

Substituting letters for numbers in the fourfold table, we write:

    a          b          a + b
    c          d          c + d
  a + c      b + d      a + b + c + d = N

For convenience we speak of a, b, c, and d as occupying the four cells of
the table, and refer to the four subtotals (marginal totals) and the grand total.

We can now use the formula:

Chi square (corrected for continuity; see below) =

$$\frac{(ad \sim bc - N/2)^2 \, N}{(a + b)(c + d)(a + c)(b + d)}$$

where ~ means ‘difference between’, regardless of + or − sign.
In words the procedure is: 


(1). Cross multiply the contents of the cells to obtain two products. 

(2). Find the difference between the products and reduce it by one-half 
the grand total. This reduction is the “correction for continuity” referred
to in Section C, Note 10. Observe that it is a numerical reduction of the
difference between the products, regardless of the sign, positive or negative, 
of that difference. 

(3). Square the result (2) and multiply this square by the grand total. 

(4). Divide the quantity found in (3) by the product of the subtotals. 

Sufficiently precise results can be obtained by four-figure logarithms, often 
by a slide rule; but it is desirable to do the first pair of multiplications on the
upper line by longhand. 
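
The four steps above can be sketched in code as follows. This is a minimal illustration rather than part of the original method description: the function name `chi_square_corrected` is hypothetical, and returning 0.0 when the correction removes the whole difference is our own convention for the case in which, as noted under the Precautions, the verdict is simply ‘not significant’.

```python
def chi_square_corrected(a, b, c, d):
    """Chi square, corrected for continuity, for the fourfold table
           a  b
           c  d
    """
    n = a + b + c + d
    # Steps (1) and (2): difference between the cross products,
    # numerically reduced by half the grand total.
    reduced = abs(a * d - b * c) - n / 2
    if reduced <= 0:
        # Assumed convention: the correction has removed the whole
        # difference, so report zero (verdict 'not significant').
        return 0.0
    # Step (3): square and multiply by the grand total.
    numerator = reduced ** 2 * n
    # Step (4): divide by the product of the four subtotals.
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Example 20: scabies and DDT
print(round(chi_square_corrected(29, 23, 64, 36), 2))  # 0.66
```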

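Applying the formula to the present example, chi square (corrected for
continuity), or $\chi^2_c$, is

$$\chi^2_c = \frac{(29 \times 36 \sim 64 \times 23 - 152/2)^2 \times 152}{52 \times 100 \times 93 \times 59}
= \frac{(1044 \sim 1472 - 76)^2 \times 152}{52 \times 100 \times 93 \times 59}
= \frac{(428 - 76)^2 \times 152}{52 \times 100 \times 93 \times 59}
= \frac{352^2 \times 152}{5200 \times 93 \times 59} = 0.66.$$
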
Turn now to Table VII. As in Example 13, two features in the table
require comment: (1) degrees of freedom, (2) probabilities. 


Degrees of Freedom 


When a chi square value is found, as in the present example, by comparing 
two or more samples in a contingency table the rule is: number of degrees of 
freedom = (number of rows of cells minus one) × (number of columns of
cells minus one). Rows and columns of totals are not counted. For a fourfold
table, therefore, degrees of freedom = (2 − 1) × (2 − 1) = 1. (For
further remarks on the term ‘freedom’ see Section C, Note 14.) 


Probabilities of Chi Square 


In Table VII, for one degree of freedom a chi square of 3.841 has a 
probability, P, of 0.05. Our value of chi square (0.66) is very far below this. 


Interpretation


There is no significant difference between the incidence of scabies in the 
two samples, i.e., no evidence that DDT, under the conditions specified, 
tended to reduce the incidence of scabies. 


Comments 


(1). As in Example 13, with chi square we use P = 0.05 and P = 0.01 
as standards in assessing significance, instead of 0.025 and 0.005 used in 
Tables IV, V, and VI; but the standards are really the same (see Section C, 
Note 14), and ½P from chi square is an approximation to the P value that
would be found by the exact method used to produce Tables IV, V, and VI. 

(2). Chi square for this example is worked out by the step-by-step method 
in Section C, Note 14, but at present it is desirable to know one step—multiply 
together the two lowest subtotals and divide by the grand total, to give the 
minimum expected or theoretical value, m. In this example m = 52 ×
59/152 = 20.2. The meaning of the procedure is seen when the step-by-step
method is used. It is introduced here because it enters into the rules
for chi square, now to be given. 


(3). Some investigators still use the standard deviation or standard error,
√(Npq), instead of chi square, for comparison of samples. This is not to be
recommended (Section C, Note 15).


Precautions in the Use of Chi Square to Compare Two Two-class Samples
(Fourfold Contingency Tables)

Tables IV, V, and VI were prepared because the approximation given by 
chi square ceases to be sufficiently close as the samples become smaller, and 
especially as m (the minimum expected value—see Comment (2) above) 
becomes smaller. For samples not covered by the tables, i.e., larger than 20, 
chi square will seldom lead one astray if the following rules are observed. 
(For the basis of the rules see Section C, Note 20.) 


(1). When m is not greater than 1, find P by the exact method—see Section 
C, Note 13. 


(2). When m is greater than 1, use chi square (corrected for continuity) to
test for significance, using P = 0.05 (chi square = 3.841) as the boundary 
between significance and nonsignificance, and P = 0.01 (chi square = 6.635) 
as the boundary between significance and high significance; but unless m is 
more than 10 do not assess the probability from Table VII any more closely. 
The only serious doubt arises in borderline cases: 

(a). If m is greater than 1 but not greater than 5, a chi square of 3.5 to 4.0 
is of doubtful significance. Employ the exact method or increase the size 
of one or both samples.

(b). If m is between 5 and 20 a chi square of 3.7 to 3.9 is doubtful, but more
likely to indicate significance than nonsignificance. As m increases above 20 
the accuracy becomes greater. 

(c). If m is not over 10, do not accept a verdict ‘highly significant’ unless 
chi square is over 7. 

(3). With m over 10, and especially over 20, the P values of Table VII are
reasonably safe, i.e., the ½P intervals can usually be accepted as indicating
what would be found by the exact method. If there is an error it will usually
make ½P somewhat greater than it ought to be.

Note.—Sometimes the correction for continuity reduces the quantity within 
the brackets of the chi square formula to zero or a negative quantity. This 
is because the correction tends to overcorrect. Calculation cannot continue 
but the verdict is ‘not significant’. 
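
The rules above lend themselves to a small decision helper. The sketch below is one possible reading of them, not a definitive encoding: the function names and the wording of the returned verdicts are hypothetical, and the treatment of a chi square between 6.635 and 7 when m is not over 10 (reported as merely ‘significant’) is our interpretation of rule (2)(c).

```python
def minimum_expected(a, b, c, d):
    """m: product of the smaller row total and the smaller column total,
    divided by the grand total."""
    n = a + b + c + d
    return min(a + b, c + d) * min(a + c, b + d) / n

def chi_square_verdict(chi_sq, m):
    """Apply the precautions for fourfold tables to a corrected chi square."""
    if m <= 1:
        return "use exact method"                      # rule (1)
    if m <= 5 and 3.5 <= chi_sq <= 4.0:
        return "doubtful"                              # rule (2)(a)
    if 5 < m <= 20 and 3.7 <= chi_sq <= 3.9:
        return "doubtful, probably significant"        # rule (2)(b)
    if chi_sq >= 6.635:
        if m <= 10 and chi_sq <= 7:
            return "significant"                       # rule (2)(c), as read here
        return "highly significant"
    if chi_sq >= 3.841:
        return "significant"
    return "not significant"

# Example 20: m is about 20.2 and chi square is 0.66
m = minimum_expected(29, 23, 64, 36)
print(round(m, 1), chi_square_verdict(0.66, m))  # 20.2 not significant
```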

Example 21 


In an investigation (7) of the mode of spread of poliomyelitis the virus was 
sought for in the feces. One series showed: 


                                   No. of households    No. of households
                                   where a case of      having contacts
                                   poliomyelitis        with cases
                                   occurred             outside the home     Total
Virus discovered in one or more
  members of household                    6                    8              14
Virus not discovered                      2                   37              39
Total                                     8                   45              53


Is there a significant association between degree of contact with cases and 
frequency of presence of the virus? 
Method

The samples are too large for Tables IV and V. See Precautions in
Example 20, and find m, the minimum expectation:
m = 8 × 14/53 = 112/53, a little over 2. Therefore calculate chi square
(corrected for continuity):

$$\frac{(6 \times 37 \sim 2 \times 8 - 53/2)^2 \times 53}{8 \times 45 \times 39 \times 14} = 8.7.$$

Since chi square is over 7, there is a highly significant difference between the
samples.


Interpretation


There is strong evidence of an association between degree of contact and 
frequency of presence of the virus. This might mean either (a) that the 
occurrence of a case of poliomyelitis tended to spread the virus among members 
of the household, or (b) that poliomyelitis tended to occur more frequently
when there was a more frequent presence of the virus in the feces of members 
of a household. It is not the function of an association test to prove causal 
relations. 

Example 22 


From the following data (12) one observer concluded that there was an 
association between tuberculosis and the condition of the sternum. 


Synostosis of sternum | No synostosis Total 
Persons with tuberculosis 4 7 11 
Persons without tuberculosis 7 66 73 
Total 11 73 84 


Was the conclusion justified? 
Method

The samples are too large for Tables IV and V. Use chi square (Example
20).

m is 11 × 11/84, i.e., between 1 and 2.

Chi square (corrected for continuity) =

$$\frac{(4 \times 66 \sim 7 \times 7 - 42)^2 \times 84}{11 \times 73 \times 73 \times 11} = 3.899.$$

The precautions given in Example 20 show that this value, being between
3.5 and 4.0, is inconclusive. The exact method (Section C, Note 13) gives
P = 0.0336, which is not significant.


Comment


Even if the association had been highly significant, the result would be
almost meaningless without careful sampling methods. Structural and
functional peculiarities often run in families and differ in frequency in different
racial stocks, and so do tendencies to disease. Structural and functional
changes occur with age and so do changes in the susceptibility to disease.
Classification according to race and family, sex, and age is necessary. This
would give a series of contingency tables, and the information, after each
table was tested, might need to be combined (see Examples 30 and 31).

Example 23 


An investigation (12) of 79 patients with diphtheria showed: 


Antitoxin administration Died Recovered Total 
Not until 6th day 15 20 35 
Before 6th day 3 41 44 
Total 18 61 79 
Test the statement that recovery tends to be more frequent with the earlier
administration. 

Method


The samples are too large for Tables IV and V. Use chi square (Example
20).

m is 18 × 35/79 = 8 approximately.

Chi square (corrected for continuity) =

$$\frac{(15 \times 41 \sim 3 \times 20 - 79/2)^2 \times 79}{18 \times 61 \times 35 \times 44} = 12.4.$$

Chi square is far above 7, and therefore the difference is highly significant.


Comment


From other knowledge of diphtheria antitoxin we may readily accept it 
as the factor responsible for the very significant difference in frequency of 
recovery, but we should recognize that the data, as presented, do not justify 
this. The chi square test has shown that the difference is much greater than 
we should expect chance to account for; but the test does not rule out bias in 
sampling, nor would a mere increase, however great, in the size of sample. 
Delay in one form of treatment (here antitoxin) may be merely part of a delay 
in proper care by physicians and nurses, and sometimes there is the added 
burden of a journey to be undertaken by the patient before treatment can be 
started. 

Example 24 


The comments on Example 6 pointed out that, if there were real justification 
for pairing of animals, the data of that example might refer to 10 pairs of 
animals from the same litter, the Treatments V and W having been allocated 
at random within each pair. Let us suppose that the results were either 
recovery or death. Specifying them in more detail we can write: 


Pair No. Result Pair No. Result 
1 Both animals died 6 V recovered, W died 
2 W recovered, V died 7 W recovered, V died 
3 W recovered, V died 8 W recovered, V died 
4 Both animals recovered 9 Both animals died 
5 W recovered, V died 10 W recovered, V died 


Of the pairs that show any difference, six show W better than V and one shows 
V better than W. 


The hypothesis to be tested is that there is no detectable difference between 
the effects of V and W when applied to animals that are as closely alike as 
are animals in the same litter. Example 6 showed that there was no signi- 
ficant difference from a 50% population frequency, P for this sample of seven 
being 0.0625. 
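
The value 0.0625 can be reproduced directly from the binomial distribution. The sketch below assumes that the test of Example 6 is the one-tailed binomial (sign) test against a 50% population frequency; the function name is hypothetical.

```python
from math import comb

def sign_test_one_tail(n_favouring, n_against):
    """One-tailed probability of a split at least as uneven as the one
    observed, when each untied pair is equally likely to favour either
    treatment. Tied pairs are assumed to have been discarded already."""
    n = n_favouring + n_against
    k = min(n_favouring, n_against)
    return sum(comb(n, i) for i in range(k + 1)) / 2 ** n

# Six pairs favour W and one favours V; three tied pairs are ignored.
print(sign_test_one_tail(6, 1))  # 0.0625
```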
It may now be asked: Cannot one take account of all 20 animals in a fourfold
contingency table?


          R     D    Total
V         2     8     10
W         7     3     10
Total     9    11     20


To test the significance of the difference in this table means to test the 
hypothesis that the two samples, V and W, are, in respect of the frequencies 
of recoveries and deaths, random samples from the same population; but if 
the original pairing was justified the two samples would tend to be more alike 
than random samples, and a difference in the effects of V and W might well 
fail to demonstrate itself. 

Table IV, under N = 10, shows that where frequencies in N1 are 2 : 8 and
in N2 are 7 : 3, P = 0.0349, i.e., no significant difference. If P had been less
than 0.025 we should have accepted it as evidence of the difference between 
V and W, even if the test applied to the pairs had shown no significant differ- 
ence; but we should have concluded that the pairing of the animals had been 
unjustifiable. The contingency test should therefore be applied where the 
test of the results from pairs of animals has shown no significance; but this 
illustrates the dangers of artificial pairing. Unless there are very good reasons 
for pairing we run a serious risk of losing information that the same number of 
animals, unpaired, might have given. This information cannot be recovered 
by a subsequent contingency test, because, as already stated, we have 
artificially created samples that may be more alike than random samples. 
Example 25 


To explore the possibility that some forms of mental deficiency might be
due to the Rh factor, mentally defective children were classified (28) as (a)
mongols and those with well-established causes, e.g., cerebral palsy and birth
trauma (the differentiated group), (b) those that could not be so differentiated.
The mothers of these children were tested and showed:


Group Rh negative Rh positive Total 
(a) 6 47 53 
(b) 14 42 56 
Total 20 89 109 


Is there a significant difference in the relative frequency of Rh negative 
mothers in the two groups? 
Method


The samples each contain more than 20 individuals, and m, the minimum
expected value, is 20 × 53/109 = 9.7 approximately. Chi square (corrected
for continuity) =

$$\frac{(14 \times 47 \sim 6 \times 42 - 109/2)^2 \times 109}{20 \times 89 \times 53 \times 56} = 2.55,$$

which is well below 3.841, the value for P = 0.05. Since m is almost 10 we
can accept the ½P intervals in Table VII as indicating the values of P that
would be obtained by the exact method (Section C, Note 13) used to produce
Tables IV, V, and VI. Here ½P is between 0.10 and 0.05, nearer the latter.


Interpretation


There is no significant difference in frequency of Rh negative mothers in 
the two groups. 


Comment 


The observations that prompted this investigation were some clinical 
observations and some discoveries in autopsies. These suggested that Rh 
incompatibility, if it did not cause death or marked physical defects, might 
cause mental defect. Therefore the investigators might wish to continue this 
investigation despite the fact that their results so far had not reached the 
conventional standard of significance. It is desirable in all such cases, 
however, to assess the existing evidence more precisely, and P found by the 
exact method is 0.0542, which agrees, as it should, with the ½P value from
chi square already obtained, but it enables the investigator to estimate his 
possible error more precisely, and to decide whether it is worth while pursuing 
the investigation further—see Section C, Note 13. 
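
The agreement between the exact P and half the chi square probability can be checked numerically. The sketch below relies on the standard fact that, for one degree of freedom, chi square is the square of a standard normal deviate, so that its upper-tail probability can be written with the complementary error function. The function name is hypothetical, and the calculation stands in for a reading of Table VII, which is not reproduced here.

```python
from math import erfc, sqrt

def p_from_chi_square_1df(chi_sq):
    """Upper-tail probability of chi square with one degree of freedom:
    P = erfc(sqrt(chi_sq / 2))."""
    return erfc(sqrt(chi_sq / 2))

p = p_from_chi_square_1df(2.55)   # corrected chi square from Example 25
half_p = p / 2                    # to be compared with the exact P of 0.0542
print(round(half_p, 3))
```

The half-probability obtained in this way lies close to 0.055, in line with the statement that it falls between 0.10 and 0.05 and nearer the latter.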

Note.—Some investigators still compare samples by the standard error or 
standard deviation, √(Npq), but this is not to be recommended (see Section C,
Note 15).


Example 26 


In the investigation used for Example 25, the following comparison was 
made. In the group classified as (b) in that example there were 14 Rh negative
mothers out of 56, i.e., 25%. Observations by other workers were quoted to 
show that the proportion of Rh negative individuals in the general white 
population was approximately 15%. The question was asked: Does the 
sample of 56 with 14 Rh negative show a significant difference from the 
population value? The treatment of this question illustrates three common 
risks: 

(1). The risk of accepting other observers’ populations as equivalent to 
one’s own. 


(2). The risk of applying tests that give only approximate answers. 


(3). The risk of accepting as population values data that are really estimates 
from samples. 
Other Observers’ Populations


The 15% frequency of Rh negatives was stated to be derived from 
random samples of the white population; but in such cases ‘random’ commonly 
means ‘unselected’, i.e., obtained without purposive sampling from the people 
or material available in the location where the observer was at work. But so 
many structural and functional features in human beings are apt to differ in 
different parts of the same country, and even in different regions of the same 
city, that, unless such variations have been thoroughly investigated, there is 
always a risk of bias. With regard to the Rh factor, marked differences 
between whites and other racial stocks have been recorded (19), and, even 
without further information, one should recognize the possibility of differences 
within different stocks of the so-called ‘white’ race. To minimize the risk of 
such bias one must choose for comparison samples within the same region. 


The Risks of Approximations 


The difference between the sample value (25%) and the population value 
(15%) was tested by a method commonly used, the standard deviation (Section 
C, Note 8) and pronounced significant; but Table IB gives a more accurate 
verdict. Number of A’s in sample = 14; N = 56. The lower limit (for 
P = 0.025) is about halfway between 14.7% (N = 55) and 14.1% (N = 57), 
i.e., the sample is nearly, but not quite, significantly different from a population 
value of 15%. In many problems the distinction may be unimportant, but in 
assessing available evidence before setting up an elaborate investigation it is 
desirable to make the assessment as accurate as possible. 

The discrepancies introduced by methods of approximation become greater 
with smaller samples and lower percentages. Indeed it may often be desirable 
to obtain even greater precision than that of Tables I and II, by the binomial 
expansion, especially where one-sided comparisons are being made—for 
method see Section C, Note 18; for application to the present problem see 
Section C, Note 8—Example (2). 
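
A direct evaluation of the binomial terms shows why the verdict is ‘nearly, but not quite’ significant. The sketch below assumes that the relevant quantity is the one-sided probability of obtaining 14 or more Rh negatives in a sample of 56 when the population frequency is exactly 15%; the function name is hypothetical, and the method of Section C, Note 18 may organize the arithmetic differently.

```python
from math import comb

def binomial_upper_tail(n, k, p):
    """P(X >= k) for X distributed as Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i)
               for i in range(k, n + 1))

tail = binomial_upper_tail(56, 14, 0.15)
# Compare with the one-sided standard of 0.025 used in Tables I and II.
print(tail > 0.025)
```

The tail probability comes out somewhat above 0.025, which agrees with the lower confidence limit read from Table IB lying just below 15%.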


The Risk of Accepting Estimates as True Population Values


The paper from which the 15% frequency of Rh negative individuals was
quoted showed that it was an estimate: the frequency found in a sample of
334 persons. In this particular problem some other samples had indicated a
similar frequency; but often percentage frequencies are quoted, in textbooks 
or articles, without indication of sample size (which may be small) and then 
are used by other workers as if they were true population frequencies. For 
the effects of such methods on the present problem see Section C, Note 19. 


2. COMPARISON OF SAMPLES (continued) 
(2) MORE THAN TWO SAMPLES; MORE THAN TWO CLASSES 
Example 27 
One hundred and twenty-nine men had been found susceptible to motion 
sickness induced by a swing. To each of five groups of these men drug mixtures 
of the same general type were administered, the difference between the groups 
being in the detailed composition of the mixtures and in the interval between
the administration of the drug and the subsequent test in the swing. (Per- 
centages of men sick and not sick for each treatment are inserted in parentheses 
but are not to be used in the test.) 


Treatment    No. of men not sick    No. of men sick    Total
V               13 (65.0%)             7 (35.0%)         20
W                7 (43.7%)             9 (56.3%)         16
X               24 (61.5%)            15 (38.5%)         39
Y               19 (67.8%)             9 (32.2%)         28
Z               20 (76.9%)             6 (23.1%)         26
Total           83                    46                129


What is the evidence that the treatments differ in their effects? 
Method 


The data form a 2 × 5 contingency table, and chi square, not corrected for
continuity, is to be used. The hypothesis to be tested (Section C, Notes 12
and 14) is that the five samples come from the same population with a ratio
of not-sick to sick as in the subtotals, 83 : 46. On this hypothesis, estimate the
hypothetical, expected, or theoretical value (t) corresponding to each of the
10 actual values (a) in the table. Thus, for V, not sick, a = 13, t = 20 ×
83/129 = 12.87. Similarly, multiplying in succession 16, 39, 28, and 26 by
83/129, i.e., 0.6434, we find the remaining t values for not-sick. For the sick
we multiply the same totals by 46/129, i.e., 0.3566, and check the results by
subtracting the t (not sick) values from the respective totals.


                        Not sick                               Sick
Treatment      a       t     a − t   (a − t)²/t       a       t     a − t   (a − t)²/t
V             13    12.87    0.13      0.00            7     7.13    0.13      0.00
W              7    10.29    3.29      1.05            9     5.71    3.29      1.90
X             24    25.09    1.09      0.05           15    13.91    1.09      0.09
Y             19    18.02    0.98      0.05            9     9.98    0.98      0.10
Z             20    16.73    3.27      0.64            6     9.27    3.27      1.15
Total         83    83.00              1.79           46    46.00              3.24


(Note that the not-sick and sick must both be included in the calculation.) 
Chi square = 1.79 + 3.24 = 5.03. 
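
The tabulated calculation can be repeated for a table of any size. The sketch below follows the same steps (expected value from the marginal totals, then the sum of (a − t)²/t over every cell) and also returns the degrees of freedom by the rule given in Example 20. The function name is hypothetical.

```python
def contingency_chi_square(table):
    """Uncorrected chi square and degrees of freedom for a contingency
    table given as a list of rows of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, a in enumerate(row):
            t = row_totals[i] * col_totals[j] / n   # expected value for the cell
            chi_sq += (a - t) ** 2 / t
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_sq, df

# Example 27: rows are treatments V, W, X, Y, Z; columns are not sick, sick
motion = [[13, 7], [7, 9], [24, 15], [19, 9], [20, 6]]
chi_sq, df = contingency_chi_square(motion)
print(round(chi_sq, 2), df)
```

Carried to full precision the sum is very slightly larger than the 5.03 obtained from terms rounded to two decimals; the difference has no effect on the verdict.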


To enter Table VII, find the number of degrees of freedom for a 5 × 2
contingency table, i.e. (number of columns minus 1) × (number of rows minus 1)
= (2 − 1) × (5 − 1) = 4. Table VII shows that our value of chi
square is far below the value, 9.488, required for significance (at P = 0.05).


Interpretation 


There is no proof that the treatments differ from each other in their
effects. Therefore the mere fact that Treatment Z seems to give the best
results is no reason for confining further attention to that treatment. There
is no adequate reason for supposing it to be better than any of the others, nor
is there adequate reason for believing that Treatment W is worse than any of
the others.


Comments 


(1). An investigator, noting that Treatment W seems to differ greatly 
from Treatment Z in its effect, might separate these from the rest and analyze 
the data in a fourfold contingency table—7 :9/20:6. Even if he thereby 
found a ‘significant’ difference the result would be fallacious because the 
samples so selected are not random samples. They have been selected on 
account of their differences in frequency, and if one purposely selects samples 
from two widely separate parts of the same population (or frequency distri- 
bution) one will frequently find greater differences than we attribute to 
random sampling in tests of significance. 


(2). The method of calculating chi square shown above is used for 
contingency tables with any number of rows or columns. 


Precautions in the Use of Chi Square for Contingency Tables Larger than 
Fourfold 


As stated with reference to fourfold contingency tables in Example 20, chi 
square gives an approximation to the exact results that would be obtained by 
much more complicated methods. It will seldom lead one astray with tables 
larger than fourfold if the following precautions are taken: 


(1). Unless all t values are 10 or over, do not accept the P from chi square
as a close approximation to the true value.

(2). If t in one or more cells is between 5 and 10, accept the verdict as
indicating merely significance or nonsignificance.

(3). If t in one or more cells is less than 5, it is probably safe to accept a
verdict of nonsignificance, because the tendency with low figures appears to
be an exaggeration of the chi square, i.e., a lowering of P. For greater safety,
or if chi square indicates significance, combine adjacent rows (or columns, or
both) in such a way as to produce no t that is less than 5 (see Example 28).


Example 28 


Patients with serpiginous ulcer of the cornea were treated with prontosil 
and the visual acuity of their affected eyes, after discharge from hospital, was 
compared with the visual acuity in patients who, previous to the introduction 
of prontosil, had been treated by other methods—argyrol or mercurochrome 
(22). The data form a 2 × 5 contingency table:



* 48 CANADIAN JOURNAL OF RESEARCH. VOL. 26, SEC. E. 


                                    Number of patients
Visual acuity              Prontosil series   Previous series   Total
A  6/6 or better                  6                  1             7
B  6/6 to 6/18                    7                  8            15
C  6/18 to 6/60                   5                  3             8
D  Less than 6/60                 2                  6             8
E  Eye lost                       0                  2             2
Total                            20                 20            40


Was there a significant difference between the results in the two series?
Since the series were not random samples (see Comments below), all that
we can do is to find out how far chance could account for the differences in
the results.


Method 


Apply chi square as in Example 27. In the first row, in place of the
actual value 6, the theoretical value (t) is 7 × 20/40, i.e., 3.5, and similarly
the other t values are found:


Visual acuity   Prontosil series   Previous series
A                     3.5                3.5
B                     7.5                7.5
C                     4.0                4.0
D                     4.0                4.0
E                     1.0                1.0


Chi square = 8.14. Degrees of freedom = (2 − 1) × (5 − 1) = 4.

P from Table VII is greater than 0.05, indicating a nonsignificant difference.
However, t in most of the cells is less than 5; therefore for safety we pool
adjacent cells in the contingency table. Sometimes the pooling of only two
rows (or columns) is sufficient, but here, looking at the t values, we find it
necessary to combine so many rows that the result is a fourfold table.
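
The arithmetic of this step can be checked with a short sketch. It assumes the usual definitions, t = (row total × column total)/grand total and chi square = the sum of (a − t)²/t over all cells; the variable and function names are illustrative only.

```python
# Visual acuity classes A to E; each row is (prontosil series, previous series).
observed = [(6, 1), (7, 8), (5, 3), (2, 6), (0, 2)]


def expected_values(table):
    """Theoretical values t = row total x column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return [[r * c / grand for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (a - t)^2 / t over every cell of the table."""
    t = expected_values(table)
    return sum((a - e) ** 2 / e
               for row, t_row in zip(table, t)
               for a, e in zip(row, t_row))


t = expected_values(observed)
chi2 = chi_square(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)
cells_below_5 = sum(e < 5 for row in t for e in row)

print(t)                      # theoretical values, row by row
print(round(chi2, 2), dof)    # chi square and degrees of freedom
print(cells_below_5)          # number of cells with t below 5
```

Counting the cells with t below 5 makes the case for pooling explicit before any rows are combined.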


The following four variants can be made: 


Visual acuity                Prontosil series   Previous series   Total

(1)
6/18 to 6/6 or better              13                  9            22
6/18 to zero                        7                 11            18
Total                              20                 20            40

(2)
6/60 to 6/6 or better              18                 12            30
Less than 6/60                      2                  8            10
Total                              20                 20            40

(3)
6/6 or better                       6                  1             7
6/6 to zero                        14                 19            33
Total                              20                 20            40

(4)
Still present                      20                 18            38
Eye lost                            0                  2             2
Total                              20                 20            40


If these tables had been larger than fourfold we should have tested them by
chi square, and similarly if the samples had been larger than here. If chi
square were to be used for these tables we should reject Variants (3) and (4)
because they would contain t values lower than 5, and Variant (1) would be
preferable to Variant (2) because the latter would still contain t values of 5
in the lower line. (If none of the t values in either variant were less than 10
it would be legitimate to adopt the one, namely, (2), that brings out the
greater contrast between the two series.)

In the present case, with samples of 20, we can use Table IV to test all four 
tables. Under N = 20, we find P values for the four variants as follows: 


Variant   Table         P
(1)       7:13/11:9     0.1703
(2)       2:18/8:12     0.0324
(3)       1:19/6:14     0.0457
(4)       0:20/2:18     0.2436


These results agree with the test applied to the original 2 × 5 table in showing
no significance.
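
For readers without Table IV, the tabulated probabilities can be reproduced, on the assumption that Table IV gives the exact single-tail probability for a fourfold table with fixed marginal totals (the hypergeometric sum for the observed table and all more extreme ones). The function name is illustrative.

```python
from math import comb


def fisher_one_tail(a, b, c, d):
    """Single-tail exact probability for the fourfold table a:b / c:d.

    Rows are the two samples; a and c count the same class in each sample.
    The sum runs over the observed table and every table more extreme in
    the observed direction, with all marginal totals held fixed.
    """
    n1, n2, k = a + b, c + d, a + c
    lowest, highest = max(0, k - n2), min(k, n1)
    if a * n2 <= c * n1:            # sample 1 has the smaller proportion
        xs = range(lowest, a + 1)
    else:
        xs = range(a, highest + 1)
    return sum(comb(n1, x) * comb(n2, k - x) for x in xs) / comb(n1 + n2, k)


variants = {
    "(1)": (7, 13, 11, 9),
    "(2)": (2, 18, 8, 12),
    "(3)": (1, 19, 6, 14),
    "(4)": (0, 20, 2, 18),
}
for label, cells in variants.items():
    print(label, round(fisher_one_tail(*cells), 4))
```

Each value is compared with 0.025, the standard used with Table IV, which is why variants (2) and (3) are still counted as nonsignificant.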


Interpretation 


There is no adequate suggestion that the two forms of treatment differed 
in their effects. 


Comments 


(1). The pooling of two treatments, argyrol and mercurochrome, does 
not allow one to test prontosil against each of them. 

(2). The example illustrates a common method of comparing clinical treat- 
ments. A new treatment is applied to all patients and the results are com- 
pared with a treatment previously applied to all patients. This is not 
equivalent to taking random samples. The two series may differ more than 
would random samples in many factors that will, or may, influence recovery, 
e.g., age incidence, sex, strain of microorganisms, differences in adjuvant 
treatment and nursing, differences in severity in hospitalized patients depend- 
ing on availability of hospital space. Some of these, e.g., age and sex, can be 
tested for, but for others we have to depend on such untested statements as:
“There seemed to be no evidence that the severity of the disease differed in
the two series.” That this is not academic hair-splitting has been shown
when clinical observers, comparing two treatments, have agreed to adopt 
strict random sampling in the comparison of the ‘old’ and the ‘new’, and 
have found that, if they had judged the effect of the ‘old’ treatment by their 
previous records they would have come to the opposite conclusion from the 
one reached by random sampling methods. 


In the present example incorrect sampling may have obscured a real differ- 
ence, in favor of one or other treatment, and this would still be true even if 
the test had shown a significant difference between the observed samples. 


Example 29 
Data (12) on sex ratios of stillborn children: 


Group                               No. of cases   No. of males per 100 females
I    Nine months and older             31114                  126
II   Ninth month                        8300                  120
III  Eighth month                       8395                  114
IV   Seventh month                      4967                  125
V    Younger than sixth month           2056                  125


The data suggest a tendency towards a lower preponderance of boys at the 
eighth month than at other periods. Test the evidence. 


Method 


Sex ratios, as here, are, if anything, more misleading in appearance than 
are percentages, and they are not easily amenable to statistical tests. There- 
fore compute exact frequencies to the nearest whole number. For example, 
in Group I if there were a total (males and females) of 226 there would be 
126 males; therefore the number of males is (126/226) of 31114 = 17347. 

Arrange the results in a contingency table: 


Group    Males    Females    Total
I        17347     13767     31114
II        4527      3773      8300
III       4472      3923      8395
IV        2759      2208      4967
V         1142       914      2056
Total    30247     24585     54832


Calculate chi square as in Example 27. Formulae (Fisher (9), Section 21)
reduce some of the labor in analysis of tables such as this, in which one division
is into only two classes (here males and females), but they are designed for
calculating machines.

Done step by step as in Example 27, on a calculating machine, the following
values for (a − t)²/t were found, the 10 contributions to chi square:


Group     Males     Females
I         1.9629    2.4150
II        0.5802    0.7138
III       5.4550    6.7113
IV        0.1324    0.1630
V         0.0543    0.0668


Their total, chi square, is 18.2547. Calculation by four-figure logarithms 
resulted in some discrepancies, especially in both columns of Group I, where 
the numbers involved were largest, but the discrepancies cancelled each other, 
giving a chi square of 18.25. 

Enter Table VII with degrees of freedom = (5 − 1) × (2 − 1) = 4. The
chi square is beyond even the value for P = 0.01: a highly significant
difference.
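
The whole calculation, from sex ratios to chi square, can be retraced as follows. This is a sketch using ordinary floating-point arithmetic, so individual contributions may differ from the machine-calculated figures in the last decimal places; the variable names are illustrative.

```python
# (group, number of cases, males per 100 females), as given in the data.
ratios = [("I", 31114, 126), ("II", 8300, 120), ("III", 8395, 114),
          ("IV", 4967, 125), ("V", 2056, 125)]

# Convert each ratio to whole-number frequencies of males and females.
counts = []
for group, cases, males_per_100 in ratios:
    males = round(cases * males_per_100 / (males_per_100 + 100))
    counts.append((group, males, cases - males))

total_males = sum(m for _, m, _ in counts)
total_females = sum(f for _, _, f in counts)
grand = total_males + total_females

# Contribution of each group (males plus females) to chi square.
contributions = {}
for group, m, f in counts:
    n = m + f
    t_m = n * total_males / grand
    t_f = n * total_females / grand
    contributions[group] = (m - t_m) ** 2 / t_m + (f - t_f) ** 2 / t_f

chi2 = sum(contributions.values())
print(counts)
print({g: round(c, 2) for g, c in contributions.items()})
print(round(chi2, 2))
```

Keeping the contributions by group shows at a glance which group accounts for most of the total.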


Interpretation 
The sample strongly indicates that the population of stillbirths that it 
represents was not homogeneous in respect of male: female sex ratio through- 
out the months. The individual contributions to chi square show that the 
nonhomogeneity is mainly due to Group III—children at the eighth month. 


Comment 


The significant difference justifies a further search into the original data 
for possible causes. Before doing so one would apply further tests for homogeneity.
By removing Group III and placing the remaining groups in a 4 × 2
contingency table one could again test by chi square. By fourfold tables one
could compare Groups I and II, II and III, III and IV, IV and V.


3. COMBINATION OF INFORMATION FROM TWO OR MORE SAMPLES


In order to provide information on a certain question an investigator often 
performs a number of separate experiments, sometimes with a somewhat 
different procedure, or different materials, in each experiment. Sometimes no 
single experiment provides a conclusive (statistically significant) result, and 
sometimes some of the experiments do so and others do not. All the experi- 
ments, however, may seem to point to the same conclusion, and the 
experimenter wishes to combine the information. 

The same type of experience is met by investigators who, owing to the 
nature of their material, must observe and analyze without experimenting. 
The question that confronts any such experimenter or observer is: Should he 
(1) combine the data, i.e., pool all his samples and apply a statistical test to 
the pooled data, or should he (2) combine the results, such as probabilities, 
previously obtained by testing each sample separately? If he adopts the second 
method another question arises: how should he combine the probabilities?

The common method of combining information is to pool the samples,
but this, as will be shown, entails a risk of fallacious conclusions from 
heterogeneous data. 


Example 30—Combination of Data 


In two groups of dogs all the animals were subjected to a treatment that 
is liable to cause death. In each group half the animals, selected at random, 
were treated by methods that, it was thought, might prevent death; the 
others were used as untreated controls. The two treatments differed, but 
possessed a common factor. The results were: 


                      Died   Survived   Total
Group A
  Treated animals       4        8        12
  Control animals       8        4        12
  Total                12       12        24

Group B
  Treated animals       3        8        11
  Control animals       8        3        11
  Total                11       11        22


How strong is the evidence that treatment tended to reduce mortality, 
either in the individual groups or in the series as a whole? (Note.—Although 
the marginal totals for the columns (died and survived) happened to corre- 
spond numerically to sample sizes, this has no bearing on the problem.) 

The two groups were not formed by taking 46 dogs and allocating them at 
random to the groups. The experiments were independent in the sense that, 
after the observations in Group A were made, a second experiment, on Group 
B, was planned. The animals might, indeed, not all be dogs, but dogs (A) 
and cats (B), or if all were dogs, they might be females (A) and males (B), or 
they might be immature males (A) and mature males (B). The same method
could apply if we had any number of groups and any number of classes within
the groups.


Method 


First find whether there is, within each individual group, significant 
evidence of the effect of treatment. 

Using Table IV because the samples are equal and do not contain more
than 20 animals in each, we find:

Group A: N = 12; P = 0.1102; Group B: N = 11; P = 0.0431.
Taking P = 0.025 as the standard for significance, we decide that neither
group shows significant evidence of the effect of treatment.* The observed
difference in both groups, however, is a lower mortality among treated animals,
and it is possible that by combining the information from both groups we
might find convincing evidence that treatment had an effect. By pooling the
two treatment samples and pooling the two control samples we could obtain a
fourfold contingency table (two samples with 23 animals in each); but before
doing so we must see whether we are thereby pooling samples that are unlikely
to be random samples from the same population in respect of mortality; i.e.,
we must test the two treatment samples for homogeneity and then test the two
control samples.

* Although this is a one-sided comparison (Section A4) we are, as usual, applying the test
as for two-sided comparisons. Note that, even if we decided to accept Group B as showing
significant evidence in favor of treatment, we might wish to strengthen the evidence by
combining the information as is done here.


Treated animals

Group    Died   Survived   Total
A          4        8        12
B          3        8        11
Total      7       16        23


There is obviously no significant difference, but if there were doubt we
should use Table V. If the samples were too large for Tables IV or V, or if
there were more than two classes or samples, we should use chi square.
Similarly the control samples show no significant difference.


Control animals

Group    Died   Survived   Total
A          8        4        12
B          8        3        11
Total     16        7        23


We can therefore combine the two treatment samples as if they were 
samples from the same population, and we can similarly combine the two 
control samples, to give a fourfold table: 


                    Died   Survived   Total
Treated animals       7       16        23
Control animals      16        7        23
Total                23       23        46


Chi square (corrected for continuity) = 5.56. 


This is far above 3.841; therefore there is a significant difference in mortality 
between treated and control animals. 
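
The corrected chi square for the pooled table can be verified with the usual shortcut formula for fourfold tables, N(|ad − bc| − N/2)² divided by the product of the four marginal totals. The pooled counts used below (7 died and 16 survived among the treated, 16 died and 7 survived among the controls) are an assumption inferred from the group totals and the probabilities quoted above; the function name is illustrative.

```python
def chi_square_corrected(a, b, c, d):
    """Chi square for the fourfold table a:b / c:d with correction for continuity."""
    n = a + b + c + d
    # The correction reduces |ad - bc| by N/2, but never below zero.
    diff = max(abs(a * d - b * c) - n / 2, 0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: treated, control; columns: died, survived (assumed pooled counts).
pooled = chi_square_corrected(7, 16, 16, 7)
print(pooled, pooled > 3.841)
```

The value 3.841 is the chi square for P = 0.05 with one degree of freedom, the level used in the text.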
Interpretation


There is good reason to believe that the treatment lowers the mortality.
From the previous tests of homogeneity we have no reason to believe that 
Treatment A differs from B in this respect, or that the two groups of animals 
differ in response. 


Example 31—Combination of Probabilities and of Chi Square Values 


To show how the information has to be combined when samples are not
homogeneous, we take the following data from an experiment like that of
Example 30.


                      Died   Survived   Total
Group A
  Treated animals       0       12        12
  Control animals       2       10        12
  Total                 2       22        24

Group B
  Treated animals       3        8        11
  Control animals       8        3        11
  Total                11       11        22

Method 


Again it appears that the treatment tends to reduce mortality, but in 
neither group, with P = 0.025 as the standard, is the difference significant 
(Table IV). Group A: N = 12; P = 0.2391. Group B: N = 11; P = 
0.0431. 

Before combining the information we test for homogeneity in two contingency
tables:


Treated animals
Group    Died   Survived   Total
A          0       12        12
B          3        8        11

Control animals
Group    Died   Survived   Total
A          2       10        12
B          8        3        11


With P = 0.025 as standard, Table V shows lack of homogeneity in the 
controls, although not in the treated animals. 


Treated animals: N₁ = 12, N₂ = 11: not significant;
Control animals: N₁ = 12, N₂ = 11: significant (P = 0.0101).

By pooling the two groups of controls we should be combining two samples
that we have good reason to believe represent different populations. We
must therefore combine the information in a way that will avoid fallacious
inference. Two methods are available: (1) summation of chi square values;
(2) conversion of probabilities into chi square values and summation of those
values.


Summation of Chi Squares 


If we add together any number of chi square values the result is a chi 
square value, and its significance can be found in a chi square table such 
as Table VII. To find the degrees of freedom we merely add together the 
degrees of freedom of all the contributing chi square values. Since most 
contingency tests are performed by chi square we shall show this method first, 
although in this example it necessitates the calculation of chi square for the 
two contingency tables. 


Group    Chi square    Degrees of freedom
A          0.5455              1
B          2.9091              1
Sum        3.4546              2


Table VII shows that this is below the level of significance (P = 0.05) and 
Fisher’s table of chi square shows by interpolation that P = 0.178. (For 
method of interpolation see Fisher (9), Section 21.1.) 

Conversion of Probabilities into Chi Square Values 

This is very useful because it can be applied to any set of probabilities 
that have been obtained from independent tests of significance, e.g., in 
mensuration data. The method (Fisher (9), Section 21.1) is to take the 
natural (Napierian) logarithm of each probability, change its sign, and double 
it. This gives the corresponding chi square for two degrees of freedom. We 
then add together all the chi square values so obtained. 

The simplest way to find the natural logarithms is to multiply the common 
logarithms by 2.303, and the multiplication can be postponed until after the 
summation. 

Applying the method to the present example, we note that the probabilities 
obtained for Group A (0.2391) and Group B (0.0431) correspond to ½P
values for chi square (see Section C, Note 14); therefore it is convenient here
to double them (0.4782 and 0.0862) because we are going to use a chi square 
table for testing significance. 


Group      P        Common logarithm of P    Degrees of freedom
A        0.4782     1̄.6796 = −0.3204                2
B        0.0862     2̄.9355 = −1.0645                2
                    Sum    = −1.3849


Sum of natural logarithms = −1.3849 × 2.303 = −3.1894.
Sum with changed sign = 3.1894.
Sum × 2 = 6.3788 = chi square for four degrees of freedom.

Table VII shows that this is below the level of significance. Fisher’s table of
chi square shows by interpolation that P = 0.172, very close to the value
found by summating chi square directly.
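
Both methods of combination can be retraced without tables. The sketch below relies on a standard closed form for the upper-tail probability of chi square when the degrees of freedom are even, which replaces interpolation in a printed table; natural logarithms are taken directly, so the chi square differs slightly from the value obtained with the rounded factor 2.303. Function names are illustrative.

```python
from math import exp, factorial, log


def chi2_upper_tail_even_df(x, df):
    """Upper-tail P for chi square x when df is even (closed form)."""
    if df % 2 != 0:
        raise ValueError("this closed form holds only for even degrees of freedom")
    half = x / 2
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))


def combine_probabilities(p_values):
    """Minus twice the sum of natural logs of P, with 2 degrees of freedom per P."""
    statistic = -2 * sum(log(p) for p in p_values)
    return statistic, 2 * len(p_values)


# Method 1: summation of the two corrected chi squares (1 + 1 degrees of freedom).
p_summed = chi2_upper_tail_even_df(0.5455 + 2.9091, 2)

# Method 2: conversion of the doubled probabilities into chi square values.
stat, df = combine_probabilities([0.4782, 0.0862])
p_combined = chi2_upper_tail_even_df(stat, df)

print(round(p_summed, 3))
print(round(stat, 2), df, round(p_combined, 3))
```

The two P values agree to about two decimal places, as the text observes.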


Interpretation 


When allowance is made for the heterogeneity of the material examined, 
there is not sufficient evidence that either treatment lowers the mortality. 


Comments 


(1). If the P value obtained by these methods (summation of chi squares, 
directly or after conversion of probabilities) had been less than 0.05, and if 
Treatments A and B had been the same, we should have concluded that 
treatment tended to lower the mortality. We should not have known whether 
this was true for both groups considered individually, but until further 
evidence was obtained we should have been justified in assuming that it 
did—that, if immediate action were required, we could act on that assumption. 
If, however, the treatment had differed but possessed a common element, we 
should not have known how far the common element was responsible and how 
far each treatment had produced its effects by some feature appropriate to 
the group on which it was used. The result would, however, have justified 
further investigation. 

(2). For contrast with the above results, which showed no significant 
difference, the original samples were pooled, giving a treatment sample of 23 
and an equal control sample. The fourfold contingency table gave a chi 
square of 3.86, just beyond the significant value. Since the control samples 
have been shown to lack homogeneity, it is unsafe to accept this result. 

(3). When chi square values from fourfold tables are to be summated as in 
this example, it is stated (5) that they ought to be calculated without correction 
for continuity; but the question seems to require further investigation. In 
the present instance the corrected values were used and, as has been shown, 
the results agreed closely with the results of combining the exact probabilities. 
The risk in using the corrected values is that the probability of the final result 
will be higher than it should be, but this is generally a less serious fault than 
claiming significance where it does not exist. 


4. CONFIDENCE LIMITS FOR DIFFERENCES BETWEEN SAMPLES 


If the difference between two samples is not significant there may, neverthe- 
less, be a ‘real’ difference, i.e., a difference between the populations from which 
the samples were drawn. For instance, two treatments are often compared 
and pronounced ‘equally efficacious’ although all that has been proved is that 
the experiment showed no significant difference, i.e., no proof that they were 
not ‘equally efficacious’. Larger samples might give convincing evidence 
that there was a real difference, but before enlarging his samples the observer 
would naturally say: “Is the real difference, if such exists, unlikely to be
more than, say, 5% greater success with Treatment A than with Treatment B?
If not, it does not interest me.”


Even if a difference between samples has been proved significant, a similar 
question arises, because the real difference may be either greater or less than 
the difference between the observed samples. 


Example 32 
The data of Example 20 are: 


                             Users of DDT   Nonusers of DDT   Total
Soldiers with scabies             29              23             52
Soldiers without scabies          64              36            100
Total                             93              59            152


Chi square = 0.66 (no significant difference).


The samples show, however, that of the 59 soldiers who had not used DDT, 
39.0% developed scabies, as against 31.2% of the 93 who had used it—a 
difference of 7.8%. If there is a real difference in scabies incidence between 
users of DDT and nonusers, what is likely to be the maximum limit of the 
difference? 


Method 


The same procedure can be followed whether the samples have shown a 
significant difference or not. In the present instance, where there was no 
significant difference, assume that DDT actually tends to lower the incidence 
of scabies. For the sample of 93 users of DDT (31.2% scabies) find the 
lower confidence limit at P = 0.025. Since the number of A’s (29 with 
scabies) is greater than 20, use Table II or Graph 2 (= Fig. 7). For N = 93, 
the required limit is approximately 22%. (For method of interpolation see 
Example 10.) 


For the sample of 59 nonusers of DDT (39.0% scabies) find likewise the 
upper confidence limit at P = 0.025—approximately 52.5%. The difference 
between 22 and 52.5 is 30.5%. 

To obtain wider limits and therefore higher confidence, we can use the ½%
levels (P = 0.005).

For the rationale of the method and the reliability of the estimates, see 
Section C, Note 16. 
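
Where Table II or the graphs are not at hand, the limits can be approximated by direct computation. The sketch below assumes that the tabulated values are exact binomial confidence limits and finds them by bisection; it is not necessarily how the tables were constructed, and the function names are illustrative.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for a binomial sample of n with true proportion p."""
    return sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))


def lower_limit(x, n, tail=0.025):
    """Proportion p at which P(X >= x) equals the chosen tail probability."""
    if x == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - binom_cdf(x - 1, n, mid) < tail:
            lo = mid        # p too small: the observed count is still too unlikely
        else:
            hi = mid
    return (lo + hi) / 2


def upper_limit(x, n, tail=0.025):
    """Proportion p at which P(X <= x) equals the chosen tail probability."""
    if x == n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(x, n, mid) > tail:
            lo = mid        # p too small: the observed count is still too likely
        else:
            hi = mid
    return (lo + hi) / 2


low_users = lower_limit(29, 93)       # users of DDT with scabies
high_nonusers = upper_limit(23, 59)   # nonusers of DDT with scabies
print(round(100 * low_users, 1), round(100 * high_nonusers, 1))
print(round(100 * (high_nonusers - low_users), 1))   # maximum likely difference, %
```

Passing tail=0.005 to both functions gives the wider limits mentioned in the text.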


5. SIZES OF SAMPLES REQUIRED


At the beginning of any investigation, or certainly in its early stages, it is 
desirable to estimate the number of observations (size of sample) that would 
probably be required to establish some particular result with a specified degree 
of precision. Sometimes, indeed, if investigators had, before starting their
projects, made such estimates, based on information already known, they 
would have seen that they could not, with the material, facilities, time, and 
money available, arrive at any conclusive result. 


Estimation of sample size can be considered in three types of problem: 
(1). Argument from a sample to a population—Examples 33 and 34. 


(2). Comparison of samples on the assumption that there is a difference 
between their populations—Example 35. 


(3). Comparison of samples on the assumption that they come from the 
same population—Example 36. 


Example 33 


A certain blood substitute, transfused into 80 patients, produced no unfavor- 
able reactions, but it was shown in Example 7, by use of Table IA, that in 
any population, represented by this sample, we should not trust the percentage 
of unfavorably reacting persons to be less than 6.4. Assuming that we found
no unfavorable reactions when we used the substitute on more and more 
people, how many people should we examine before we could conclude that 
the percentage of unfavorable reactions was very unlikely (P = 0.005) to be 
more than 1 in 500, i.e., 0.2%? 


Method 


Table IA, under “Number of A’s in sample = 0”, shows that, if N = 1000,
the upper limit (P = 0.005) is 0.53%. With these higher values of N the
confidence limit is inversely proportional to N, e.g., at N = 1000 the value
(0.53) is approximately one-half the value (1.1) at N = 500. At N = 2000
the value would be approximately one-half of 0.53, i.e., 0.265%.

To find N for the limit 0.2%, divide the value for N = 1000 by N/1000, i.e.,
0.53 ÷ (N/1000) = 0.2.
Therefore 530/N = 0.2. Therefore N = 2650.

Interpretation


If we continued to find no unfavorable reactions we should require to 
investigate about 2650 persons before being justified in stating that the 
percentage of unfavorable reactions was very unlikely to be more than 0.2% 
or 1 in 500. 
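
The proportional rule can be compared with a direct calculation. The sketch assumes that the Table IA limit for a sample containing no A's is the exact binomial one, i.e., the proportion p at which the chance of seeing no A's in N, (1 − p)^N, falls to the chosen P. Function names are illustrative.

```python
from math import ceil, log, log1p


def upper_limit_zero_events(n, tail=0.005):
    """Upper limit for the true proportion when none of n observations is an A."""
    return 1 - tail ** (1 / n)


def sample_size_for_limit(limit, tail=0.005):
    """Smallest n, with no A's observed, whose upper limit does not exceed `limit`."""
    return ceil(log(tail) / log1p(-limit))


print(round(100 * upper_limit_zero_events(80), 1))     # per cent, N = 80
print(round(100 * upper_limit_zero_events(1000), 2))   # per cent, N = 1000
print(sample_size_for_limit(0.002))                    # N for a limit of 0.2%
```

The direct calculation gives a figure within a few units of the 2650 obtained by proportion, which is as close as the two-figure tabulated value allows.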


Example 34 


In Example 1, Variant (3), vaccination against a certain disease had pro- 
tected 15 rats out of 25 (60%), but it was shown that there was no adequate 
reason to suppose that it would protect more than 38.7% of such rats. 
Assuming that the investigator examined more rats and still found 60%
protected, how many would he require to examine before he could feel reason- 
ably confident (P = 0.025) that the protection rate was no less than 50%? 
Method


In Table II find “Percentage of A’s in sample = 40”, i.e., the percentage
unprotected. Find a value of N such that, when 40% of that sample are 
unprotected, one can say that it is unlikely that the true percentage unpro- 
tected is greater than 50%. 


At N = 100 the upper limit (P = 0.025) is 50.2; at N = 110 it is 49.7. 
Therefore the required number of rats is between 100 and 110. For greater 
confidence (P = 0.005) one would require approximately 170 rats. 
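
The search through Table II can be imitated by an exact calculation, assuming the tabulated limits are exact binomial limits. The upper limit for the unprotected percentage falls to 50% or below when a sample with 40% unprotected would be sufficiently improbable if the true percentage were 50; that probability needs only whole-number arithmetic. The step of 10 between trial sample sizes mirrors the spacing quoted from the table and is an assumption of this sketch.

```python
from math import comb


def lower_tail_at_half(n, x):
    """P(X <= x) when the true proportion is exactly 50% (exact arithmetic)."""
    return sum(comb(n, k) for k in range(x + 1)) / 2 ** n


def smallest_n(tail=0.025, start=50, step=10):
    """First n (in steps of `step`) at which 40% unprotected excludes a true 50%."""
    n = start
    while True:
        unprotected = n * 40 // 100          # 40% of the sample
        if lower_tail_at_half(n, unprotected) <= tail:
            return n
        n += step


print(lower_tail_at_half(100, 40))   # still above 0.025 at N = 100
print(smallest_n())                  # first tabulated N that suffices
```

A finer step would locate the required number more closely within the interval between 100 and 110.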


Comment 


In an actual investigation, of course, the larger sample might show a 
higher or lower percentage than did the original sample, and the size of sample 
required would depend on that percentage. We could therefore assume 
various values for this percentage (subject to the proviso that the large sample 
did not differ significantly from the original sample), and use Table II to give 
estimates of necessary sample sizes. For most purposes, however, the one 
simple procedure, shown above, gives an adequate ‘most likely’ estimate. 


Example 35 
Example 20 presents the following data: 


                             Users of DDT   Nonusers of DDT   Total
Soldiers with scabies             29              23             52
Soldiers without scabies          64              36            100
Total                             93              59            152


The incidence of scabies in users of DDT was 31.2%; in nonusers, 39.0%. 
Difference = 7.8%; but chi square (0.66) showed that this difference was 
not significant. If, however, the difference persisted when we examined more 
soldiers, we should reach a stage when we should pronounce it significant. 
How many soldiers should we require for this? 


Method 


Let it be supposed that we can investigate equal numbers of soldiers who 
have, and who have not, used DDT, and let N be the number required in each 
sample. In Table VI call the users of DDT Sample (1), with 31.2% scabies; and
call the nonusers Sample (2), with 39.0%. There are more A’s in Sample (2);
therefore use the left half of the table. Locate 31.2% of A’s in Sample (1) 
between 25 and 50%, and for the lower degree of confidence (P = 0.025) use 
the lower part of the table. 


For N = 200 the minimum significant difference is between 9.50 and 
10.50—about 10%; for N = 500 it is between 5.80 and 6.40—about 6%. 
The observed difference is 7.8%. Therefore the required N is roughly half- 
way between 200 and 500—say 350, i.e., a total of 700 soldiers. 


To obtain a closer approximation we can now construct a table, with
proportions (scabies: no scabies) as in the original samples, but containing 350 
in each sample. 


Users of DDT Nonusers of DDT Total 
With scabies 109 137 246 
Without scabies 241 213 454 
Total 350 350 700 


Chi square (corrected for continuity) = 4.5 approximately. Therefore to 
provide a difference that is just significant (chi square = 3.841) we require 
fewer than 350 in each sample. If the samples are reduced to 300 each, chi 
square becomes approximately 3.5. Two samples with 320 in each give a 
chi square of approximately 3.9. 


Another method commonly used for estimating sample sizes in such problems is based
on the important fact that chi square is proportional to the sizes of samples, i.e., if we multiply
the sample sizes and cell contents in a contingency table by any number, k, we multiply the 
value of chi square by k also. Thus, if we double the information we double the chi square 
value. Strictly, this relationship applies only to chi square values calculated without correction 
for continuity, but it is accurate enough for the first stage in estimates based on the corrected 
values of chi square. With such values it leads to an overestimate of the required sample size. 

In the present example chi square (corrected for continuity) for the samples of total 152 
is 0.66. For a significant difference we require a chi square of at least 3.841 (Table VII), 
i.e., nearly six times the original value. Trial with various factors, however, shows that it is 
sufficient to multiply the original samples by 4.4, giving a total of 670, in order to obtain a 
chi square (corrected) of approximately 3.8. Note that the total is greater than was required 
(640) when the samples were equal, because equal samples give more information than 
unequal samples of the same total size. 
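
The proportionality on which this method rests can be checked directly for chi square calculated without the correction for continuity. The sketch also shows the first-stage multiplier obtained from the uncorrected value; since it ignores the correction, it is only a starting point for the trials described above. Function and variable names are illustrative.

```python
def chi_square_uncorrected(a, b, c, d):
    """Chi square for the fourfold table a:b / c:d, without correction for continuity."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Rows: with scabies, without scabies; columns: users, nonusers of DDT.
base = chi_square_uncorrected(29, 23, 64, 36)

# Multiplying every cell by k multiplies the uncorrected chi square by k.
k = 4
scaled = chi_square_uncorrected(29 * k, 23 * k, 64 * k, 36 * k)
print(base, scaled, k * base)

# First-stage multiplier needed to reach chi square = 3.841 (uncorrected basis).
multiplier = 3.841 / base
print(round(multiplier, 2), round(152 * multiplier))
```

With the corrected value 0.66 in place of the uncorrected one, the same division gives the factor of nearly six mentioned in the text.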

Finally, it may be undesirable or impossible to add to one of the samples, and the other 
must be increased greatly to compensate. The simplest method is to estimate roughly the 
required total as if both samples were equal in size, then transfer the increase to one of the 
samples and test by chi square. If the increase is too small (or too great), increase (or decrease) 
the one sample by 50 or 100, and test again. In the present instance one could start by
increasing the DDT users from 93 to, say, 700, making the number with scabies 31.2% of 700,
i.e., 218.


Example 36 


In Example 17 there was shown to be no significant association between 
temperature and mortality in the following sample of dogs: 


Died Survived Total 
Heated 11 4 15 
Cooled 4 6 10 
Total 15 10 25 


By the method of Example 32, but using Table IB, it can be shown that the 
lower limit (at P = 0.025) for the survival rate in heated animals is 7.8%, 
while the corresponding upper limit in cooled animals is 87.8%—a difference 
of 80%. This is a very large possible difference, and the investigator might 
say: “Even although the samples show no proof of a real difference, a real
difference may nevertheless exist. I shall not be satisfied unless it can be
shown that such a possible real difference is unlikely to be more than 10%.
How many animals do I require to prove this, if there is in fact no difference?”


Method 


To make a precise estimate of the number required is rather complicated, 
but usually only an approximate estimate is needed. Therefore we can use 
the fact that, for any fixed probability value, the estimated confidence limits 
of the difference between two samples vary, approximately, inversely as the
square root of sample size. (An inverse square root relationship of similar 
kind is illustrated in Table VI—see Example 19.) 

For the observed samples (total: 25 animals) the difference between the 
confidence limits is 80%. If there is no real difference, i.e., if the samples 
belong to the same population, and we increase the sample size, we shall 
expect still to find no significant difference, and we shall narrow the estimate 
of the possible real difference. If we multiply the total number of animals 
by four we shall, approximately, halve the difference (division by √4).


Let N = the multiplier required to narrow the difference from 80 to 10%.
Then 80/√N = 10. Therefore √N = 8, and N = 64. The new total
would therefore be 25 × 64 = 1600 animals. This is, however, a very
rough estimate (see Section C, Note 17).
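
The inverse square root rule reduces to one line of arithmetic, sketched here with an illustrative function name; it inherits the roughness of the rule itself.

```python
def multiplier_for_target(current_width, target_width):
    """Factor by which the sample must grow if width varies as 1/sqrt(sample size)."""
    return (current_width / target_width) ** 2


factor = multiplier_for_target(80, 10)   # narrow 80% to 10%
new_total = 25 * factor                  # 25 animals in the observed samples
print(factor, new_total)
```

Halving the target to 5% would quadruple the factor, which shows how quickly the required numbers grow.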


6. MEASUREMENTS TREATED AS QUALITATIVE STATISTICS 


Wherever possible, it is better to make observations by measurements than 
by qualitative distinctions. Not only does measurement tend to be more 
objective, but it enables finer distinctions to be made during the observations 
and when the data are being tested—by comparison of means (two samples), 
by analysis of variance (two or more samples), or by methods of testing for 
regression, correlation, or covariance. Sometimes, however, it is desirable or 
necessary to treat mensuration data as if they were enumeration data, i.e., 
qualitative statistics. 


Example 37 


Example 28 presented measurements of visual acuity in a comparison of 
the effects of prontosil and other treatment; but the data were tested as if 
they were qualitative statistics. It might be asked why we did not use the 
actual measurements and compare the mean (average) visual acuity in the 
prontosil series with the mean visual acuity in the other series. There are 
two reasons why this is not advisable: 


(1). The measurements of visual acuity (6/6 or better, 6/6 to 6/18, 6/18 
to 6/60, less than 6/60, eye lost) are not sufficiently precise or uniform. 

(2). Even if more precise measurements were available and if the ‘eye lost’ 
group were eliminated, we should not know whether a frequency distribution 
of such measurements would be sufficiently like a normal curve to justify 
applying the ordinary tests for mensuration data. Before applying such
methods we have to know, either from previous experience of such data or 
from the sample under observation, that normal curve methods are 
appropriate. 


Example 38


A certain drug, supposed to reduce the incidence of motion sickness, was
administered to 104 men who were then tested in a swing. The tests were
given to different groups at different intervals after administration of the
drug. (We shall assume that the three groups were chosen at random.)


Interval between drug    No. of men    No. of men    Total
and swing test           not sick      sick
One-half hour                41            10          51
One hour                     23             7          30
Two hours                    12            11          23
Total                        76            28         104


Such a table could contain data from many different kinds of investigation;
e.g., the headings could be: “Interval between injury and surgical treat-
ment—numbers of surgical successes and failures”, “Distance in a tissue from
a source of irritation—numbers of migratory cells of Types A and B”, “Age
groups of cadavera (20-40, 40-60, 60-80 years)—presence and absence of
duodenal diverticula”. The following remarks would apply to any of them.

It will be seen that in each of the three groups the proportion of men not 
sick can be expressed as a percentage of the total in that group: one-half 
hour—80.4%; one hour—76.7%; two hours—52.2%. A graph could now 
be drawn, with time interval on the x-axis and percentage on the y-axis, and it 
would show an apparent downward trend of percentages as the interval 
increased. The common way of testing the reality (significance) of such an 
apparent trend is by a regression (or correlation) method, but with only three 
or four sets of values, unless the graph is very nearly a straight line, the trend 
cannot be proved significant. 


The more generally useful method is to test the apparent association 
between time interval and sickness in the 3 X 2 contingency table containing 
the original data (not percentages). Chi square = 6.69. Degrees of 
freedom = 2. P, from Table VII, is between 0.05 and 0.01. There is 
therefore a significant association between the incidence of sickness and the 
time interval. This is perhaps all that need be known in such a problem, but 
we should note how limited are the inferences from such association tests. 
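The chi square value quoted above can be checked directly from the 3 × 2 table. The following Python sketch is an illustration only: the function name is ours, and the closed form used for P holds for exactly 2 degrees of freedom, so it is not a substitute for Table VII in general.

```python
import math

def chi_square(table):
    """Chi square for an r x c contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, degrees_of_freedom

# Rows: one-half hour, one hour, two hours. Columns: not sick, sick.
swing = [[41, 10], [23, 7], [12, 11]]
chi2, dof = chi_square(swing)

# With 2 degrees of freedom the upper-tail probability is exp(-chi2 / 2).
p_value = math.exp(-chi2 / 2)
print(round(chi2, 2), dof, round(p_value, 3))
```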


Limitations of Association Tests 


All that the association test has shown is that there is a significant 
difference in the incidence of sickness among the three groups. Two things it 
cannot show: 




(1). The type of the association. We should obtain the same chi square
value if, without changing the body of the table, we interchanged the headings
“one-half hour” and “two hours”, although the type of association would be
different—longer interval associated with lower incidence of sickness. It is
by inspection of the table that we interpret the association as an increasing 
incidence of sickness with increasing interval; and even then we do not know 
whether the relation is to be expressed by a straight line or by a curved line 
as the data seem to suggest. 

(2). The degree of association. We do not know how closely the sickness 
incidence and the time interval are related, i.e., whether a short increase of 
time is associated with a slight increase in the incidence of sickness. From 
analysis of the 3 X 2 table we do not even know whether there is a significant 
difference between the one-half-hour group and the one-hour group, or between 
the one-hour and the two-hours group. (Actually, analysis by fourfold tables 
reveals no significant differences in either of these pairs.) 

The regression method, where applicable, not only establishes an association, 
but shows its type and degree. 
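The remark about the pairwise comparisons can be illustrated with the ordinary fourfold chi square formula. This Python sketch assumes the uncorrected chi square with 1 degree of freedom and the conventional 5% critical value of 3.841; it does not reproduce the exact method or the precautions recommended elsewhere in this article for small samples.

```python
def fourfold_chi_square(a, b, c, d):
    """Uncorrected chi square (1 degree of freedom) for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

CRITICAL_5_PERCENT = 3.841   # chi square for P = 0.05 with 1 degree of freedom

# Counts are (not sick, sick) for each pair of adjacent groups.
half_vs_one = fourfold_chi_square(41, 10, 23, 7)
one_vs_two = fourfold_chi_square(23, 7, 12, 11)

for label, value in (("one-half hour vs one hour", half_vs_one),
                     ("one hour vs two hours", one_vs_two)):
    print(label, round(value, 2), value > CRITICAL_5_PERCENT)
```

Neither value reaches the 5% critical value, although the whole 3 × 2 table does show a significant association.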


Example 39 


Enumeration tests may be useful in preliminary inspection of mensuration 
data. Fisher (9, Section 24) gives an example of this. Two supposedly 
soporific drugs, A and B, were tested on 10 patients, both drugs on each 
patient. The additional hours of sleep obtained under each drug were 
recorded, and it was found that in nine patients Drug B was apparently more 
effective; in one patient there was no measurable difference. The most 
appropriate method of analysis is to find the mean of the 10 differences and 
test its significance; but the data can also be treated as enumeration data. 
Of the nine patients who showed an apparent difference between the two drugs, 
all suggested that B was the more effective. If they were equally effective, 
the true (population) ratio—A more effective: B more effective—would be 
1:1. This ratio means a population value of 50% in each class, and Table IA 
shows that a sample of nine with zero A’s has an upper confidence limit of 
under 50% A’s, i.e., among those showing an appreciable difference, a signi- 
ficant majority obtained a greater increase of sleep after B than after A. 

A test appropriate to mensuration data (Fisher, 9, Section 24) showed that 
the mean of the 10 differences was significant, i.e., a significantly greater mean 
increase in hours of sleep was obtained after Drug B. The technique of the
test is not relevant to this article; but the precise nature of the two conclusions, 
italicized above, should be carefully noted and contrasted. Observe: 

(1). The conclusion from the enumeration test may be all that is necessary 
in practice. 

(2). The mensuration test shows not only that B is probably better than A, 
but how much better. 

(3). The enumeration test is less sensitive; e.g., as Table IA shows, four
patients would be insufficient to prove anything by the enumeration test, but
if the four differences varied very little from each other the mensuration test
could prove the mean difference significant.


(4). Because the two tests do not establish precisely the same conclusion, 
we should not expect the probabilities found by them to be exactly the same. 
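The enumeration test in this example amounts to a single binomial tail probability under a 1 : 1 population ratio. The Python sketch below computes it directly instead of reading Table IA; the function name is ours, and the single-tail criterion P = 0.025 is the convention described in Section C, Note 4.

```python
from math import comb

def upper_tail(successes, n, p=0.5):
    """Probability of `successes` or more out of n when each individual has probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(successes, n + 1))

nine_of_nine = upper_tail(9, 9)   # all nine patients favor Drug B
four_of_four = upper_tail(4, 4)   # the most extreme result possible with four patients

print(nine_of_nine, nine_of_nine < 0.025)
print(four_of_four, four_of_four < 0.025)
```

Nine out of nine falls well inside the rare tail, whereas even four out of four does not, which is the loss of sensitivity noted in (3).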


Example 40 


Sometimes measurements, especially in the early stages of an investigation, 
do not lend themselves readily to analysis by simple methods suitable for 
mensuration data. The investigator may, indeed, not desire at any stage of 
his work the full information, e.g., formulae for curves, provided by elaborate 
methods of analysis, but even in the early stages he will wish to form an 
estimate of the probable frequency of certain phenomena before deciding to 
investigate the phenomena further, either by additional observations or by 
more elaborate statistical methods. 


An example of a somewhat complex frequency distribution will suffice to
illustrate a simple method of treating such problems. Fig. 1 represents bimodal
frequency distributions, i.e., with two modes indicated by the two peaks.
Such frequency distributions were found at an early stage in an investigation
of gastric acidity (27), the measurements of total acidity being the ‘measured
attributes’ in this case. Such bimodal curves suggest that there are really
two populations, which, if separated, would form two simple bell-shaped
distributions.


Fig. 1. Bimodal frequency distributions. The X axis represents attributes, e.g., gastric
acidity (mensuration data) or classes of sample (enumeration data). The Y axis represents
frequency, e.g., number of persons or number of samples.


A sample from a unimodal population may, however, show bimodality as a 
result of chance, but the test appropriate to determine this is rather com- 
plicated, and, unless the bimodality is pronounced or the sample is large, no 
evidence of significant bimodality is likely to be obtained. The investigator 
may continue to accumulate observations that can be pooled to form a large 
sample, but the type of investigation may not lend itself to this. For example, 
one may be measuring a small number of blood cells from each of a dozen 
individuals, or a small number of bacteria from each of 50 different colonies. 
If each sample represents a different population of individuals the samples 
should not be pooled; but the frequency of bimodal samples can be easily 
investigated. From Table IA and IB (Examples 5 and 6) we can conclude 
that there is a significant majority of bimodal samples if all of six samples are 
bimodal, or eight of nine samples, or 10 of 12 samples, and so on. If, on the 
other hand, there is no proof that bimodal samples would be likely to occur in 
more than 5% or 10% of a large population of samples, the investigator may 
decide that a search for a possible cause is not worth while. 
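The thresholds quoted above (six of six, eight of nine, 10 of 12) can be recovered by searching for the smallest count whose tail probability under a 50% population does not exceed 0.025. The following Python sketch assumes that criterion; it is an illustration of the reasoning and not a reproduction of Tables IA and IB.

```python
from math import comb

def smallest_significant_majority(n, tail=0.025):
    """Smallest count k out of n such that P(k or more) <= tail when the
    population value is 50%. Returns None if even n out of n is not rare enough."""
    for k in range(n // 2 + 1, n + 1):
        p = sum(comb(n, j) for j in range(k, n + 1)) / 2**n
        if p <= tail:
            return k
    return None

for n in (5, 6, 9, 12):
    print(n, smallest_significant_majority(n))
```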


In the investigation of gastric acidity a sample was contributed by each
sex-age group among the people surveyed, and a majority of the samples
showed bimodality. Further examination of the data revealed that, when 
the records of persons with no free acid were separated from those with free 
and combined acid, the frequency distributions for total acidity were all 
unimodal. 

Section C—Notes 


Note 1—RANDOM SAMPLING VARIATION 


The basis of statistical tests is random sampling variation, i.e., the differ- 
ences between random samples of the same population. It is therefore 
important to appreciate, from simple examples, how the variation occurs and 
what variation to expect. 


A Random Sampling Demonstration 


The demonstration to be described here may seem rather artificial, but it 
gives a clearer conception of general principles than would an example from 
the laboratory or the clinic. When an investigator meets a difficult problem 
of sampling or of interpretation of statistical tests, he will be well advised to 
visualize it in terms of this sampling demonstration. Indeed, the actual 
technique of the demonstration is a very useful method of selecting samples in 
real investigation. 

One thousand circular metal-rimmed cardboard disks, 1 in. in diameter,
were used. They could be taken to represent human patients, animals, livers 
from autopsies, red blood cells in a minute (or diluted) drop of blood, or any 
other collection of individuals, animate or inanimate. The disks were lettered 
and numbered in various ways, as will be specified later, then placed in a box 
and mixed thoroughly by taking handfuls and allowing the disks to dribble 
through the fingers, in the same way as flour and other dry ingredients are 
mixed in breadmaking. After 8 or 10 such movements the disks were poured 
into another box and mixed again, and then, to obtain a sample of, say, 20 
individuals, a handful was taken and 20 disks were counted without inspection 
of their letters or numbers. The remainder of the handful was then returned 
to the box, as was the sample when its composition (letters and numbers) had 
been recorded. The thousand disks were then thoroughly mixed again before 
the next sample was taken. 


Samples of Two Individuals 


If there are equal numbers of disks marked M and F (males and females) 
in the population and if, taking a number of random samples, two disks 
per sample, we look at the disks one after the other, we expect, from our 
knowledge of chance, that the first disk looked at will be M in about half the 
samples and F in about half the samples. We expect also that, of all the 
samples in which the first disk is M, about half will have their second disk M 
also, while half will have their second disk F. Similarly, among the samples 
with first disk F, there will be approximately equal numbers with second 
disk M and second disk F. In diagram form: 


First disk M (half the samples)
    Second disk M (one-quarter of the samples)
    Second disk F (one-quarter of the samples)

First disk F (half the samples)
    Second disk M (one-quarter of the samples)
    Second disk F (one-quarter of the samples)

The order (first or second) in which the disks are examined is, of course, 
immaterial, and the results can be expressed thus: 


Type of sample        MM     MF     FF
Relative frequency    1/4    1/2    1/4


We know, again from our experience of chance, that, if we take only a few
samples, they may differ considerably from these proportions, but if we took 
more and more samples and they continued to differ we should become more 
and more suspicious, either that the original ratio of M to F was not 1 : 1, or 
that there was bias in the sampling. If there is no bias we expect the propor- 
tions to approach closer and closer to those deduced from our knowledge of 
the population. Some actual experiments showed the following: 


Type of sample                     Relative frequencies
(two disks each)    First         First          First          1000
                    10 samples    100 samples    500 samples    samples
MM                  0.30          0.26           0.21           0.23
MF                  0.40          0.44           0.51           0.51
FF                  0.30          0.30           0.28           0.26
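The disk experiment can be imitated on a computer. The Python sketch below is a simulation under stated assumptions: a box of 500 M and 500 F disks, two disks drawn without replacement for each sample, and the box restored before the next draw. The seed and function name are ours, and the figures it prints will not match the experimental values tabulated above.

```python
import random
from collections import Counter

def simulate_pairs(n_samples, seed=0):
    """Relative frequencies of MM, MF and FF among n_samples random pairs of disks."""
    rng = random.Random(seed)                  # fixed seed so the run is repeatable
    population = ["M"] * 500 + ["F"] * 500     # the box of 1000 disks
    counts = Counter()
    for _ in range(n_samples):
        pair = rng.sample(population, 2)       # two disks drawn without replacement
        # Order of examination is immaterial, so MF and FM are counted together.
        counts["".join(sorted(pair, reverse=True))] += 1
    return {kind: counts[kind] / n_samples for kind in ("MM", "MF", "FF")}

for n in (10, 100, 500, 1000, 20000):
    print(n, simulate_pairs(n))
```

Because the seed is fixed, each longer series begins with the same samples as the shorter ones, as in the columns of the table.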


A Population of Samples 


Such a process of sampling gives us a population of samples, and we can 
imagine the process continuing indefinitely, to produce an infinite population, 
i.e., a population that can be made as large as one likes. As the process 
continued the relative frequencies would approach closer to the values 0.25, 
0.50, 0.25; and these can be called the ‘true’ values, or the values for the 
infinite population. For brevity one can speak of true (population) values, 
defined as values that would be approached closer and closer by continued 
random sampling. (The approach is, of course, not steady as in many mathe- 
matical series that approach a limit. It fluctuates as samples vary, but its 
main line continues.) 


Recalling the terms “probability”, “odds”, and “chances” from Section A,
we see that the relative frequencies (0.25, 0.50, and 0.25) are obviously
probabilities. They can be expressed also as percentage frequencies: 25%,
50%, 25%. Thus, the probability of obtaining a random sample of Type MM
is 0.25, i.e., we should expect 25% of random samples to be of that type.
The odds against finding an MM sample are 3 to 1, i.e., there is one chance in
four of finding such a sample.


Note 2—THE BINOMIAL EXPANSION 


Turning again to the population of 1000 disks (Note 1), we could use the 
same process of reasoning, based on our experience of chance, to work out the 
relative frequencies of M and F disks in any size of sample, and we could do 
likewise in experiments where there were 700 disks marked R (recoveries) and 
300 disks marked D (deaths), i.e., for probabilities of 0.7 and 0.3; and 
similarly for any other classes of marked disks, e.g., for probabilities of 0.997 
and 0.003. For samples containing more than three or four individuals,
however, and for unequal probabilities, the step-by-step calculation from first
principles is lengthy and laborious. Therefore we use a formula, the binomial
expansion. This is not a mere approximation to the step-by-step method and 
it does not introduce any additional assumptions. Using the same kind of 
reasoning as in the step-by-step method, the mathematician can show that the 
binomial expansion gives the required results for any size of sample and for 
any pair of probabilities in a twofold classification such as we are discussing 
(note the derivation of ‘binomial’: bi-, double; nomen, name).


A Simple Binomial Expansion


Medical investigators seldom need to use the binomial expansion directly,
because tables derived from it (or approximating to it) are available; but they
should know how it works in a simple case, e.g., the M and F disks in our
demonstration.


The formula is $(p + q)^N$, where
p = the probability of meeting an M disk;
q = the probability of meeting a not-M disk, i.e., an F disk;
N = the number in the sample, here 2.

$(p + q)^2 = p^2 + 2pq + q^2$.


The superscript numerals indicate the composition of the samples, e.g., $p^2$
refers to samples with two individuals, each of whose probability is p, i.e.,
two M disks. The quantity pq, i.e., $p^1 q^1$, refers to samples with one M disk
and one F disk; $q^2$ refers to samples with two F disks.


In the present instance p = 1/2, q = 1/2. Substituting these values in the
formula, we see how the appropriate probabilities are reached:

$(\tfrac{1}{2} + \tfrac{1}{2})^2 = (\tfrac{1}{2})^2 + 2(\tfrac{1}{2})(\tfrac{1}{2}) + (\tfrac{1}{2})^2 = \tfrac{1}{4} + \tfrac{1}{2} + \tfrac{1}{4}$.


A More Difficult Binomial Expansion


There were 700 R disks (70%; probability = 0.7) and 300 D disks (30%;
probability = 0.3). If we take random samples of 20 disks each, what are
the probabilities of the various classes of sample?


p = 0.7; q = 0.3; N = 20. $(p + q)^N = (0.7 + 0.3)^{20}$.


The results of this expansion give us a population of samples (20 disks per
sample) and are set out in Table I. For comparison the results of some actual
disk sampling experiments are shown. Fig. 2 is a graph of the true prob-
abilities converted to percentage frequencies.


TABLE I

No. of disks in    % of R     Probabilities from the       Percentage frequencies found experimentally
each sample        in         binomial (0.7 + 0.3)^20      in different series of samples
  R       D        sample      Actual   Converted to %          10    1st 100   1st 500     1000
  0      20            0       0.000         0.0
  6      14           30       0.000         0.0
  7      13           35       0.001         0.1
  8      12           40       0.004         0.4                                       0.2
  9      11           45       0.012         1.2                      3       1.6      1.5
 10      10           50       0.031         3.1          10          4       2.8      3.1
 11       9           55       0.066         6.6                      7       8.2      6.8
 12       8           60       0.114        11.4          10         13      14.0     12.3
 13       7           65       0.164        16.4          40         19      15.8     16.6
 14       6           70       0.192        19.2          10         23      20.8     20.6
 15       5           75       0.179        17.9          10         12      13.6  [illegible]
 16       4           80       0.130        13.0          20         13      13.2     12.9
 17       3           85       0.072         7.2                      5       6.6      6.8
 18       2           90       0.028         2.8                      1       2.4      2.8
 19       1           95       0.007         0.7                              1.0      0.7
 20       0          100       0.001         0.1
Total                          1.001       100.1         100        100     100.0    100.0

Note.—0.000 indicates a probability less than 0.0005. By using sufficient decimal figures
we should find a value for each class, progressively diminishing toward the tail of the distribution.
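Both expansions discussed in this Note can be evaluated term by term. The Python sketch below is a minimal illustration; the function name is ours. Its figures should agree with the ‘Actual’ column of Table I to within about one unit in the last decimal place, since the printed values were rounded after hand computation.

```python
from math import comb

def binomial_terms(p, n):
    """Terms of (p + q)^n with q = 1 - p: element k is the probability of a
    sample containing k individuals of the first class and n - k of the second."""
    q = 1 - p
    return [comb(n, k) * p**k * q**(n - k) for k in range(n + 1)]

# Simple case: M and F disks, two per sample. Element 0 is FF, 1 is MF, 2 is MM.
print(binomial_terms(0.5, 2))

# More difficult case: 70% R disks, 20 per sample.
terms = binomial_terms(0.7, 20)
for k in range(6, 21):
    print(f"{k:2d} R, {20 - k:2d} D: {terms[k]:.3f}  ({100 * terms[k]:.1f}%)")
```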


Fig. 2. Graph of percentage frequencies from Table I, obtained by expansion of
$(0.7 + 0.3)^{20}$. The X axis represents numbers of R in samples of 20. The Y axis
represents frequency, %.


A table or graph that shows how many items, e.g., individuals or samples,
occur in the various classes is called a table or graph of frequency distribution. 
In frequency distributions the value for the class that contains most items is 
called the mode, the ‘most fashionable’ value (French, la mode, the fashion).
It is indicated by the peak of frequency graphs. As would be expected, the 
mode in the present example is 70% R, the true percentage for the original 
population of 1000 disks. On each side of the mode the graph slopes down to 
form a tail, extending to the ends of the distribution. 


Note 3—ARGUMENT FROM SAMPLE TO POPULATION 


In medical research we meet, as a rule, not populations but samples, and, 
in one form or another, the questions arise: To what populations may this 
sample belong, and to what populations is it unlikely to belong? To answer 
those questions we can, as in Note 2, work out the random sampling distri-
bution for any population, and then, if our sample would be found only rarely
in a certain population, we can say: “It is unlikely to belong to that popu-
lation.” If the sample would be found often in certain other populations we
can say: “It may well belong to one of those populations; we have not
sufficient reason for believing that it does not.”

Let us suppose that we did not know about the markings on the thousand
disks, except that each was marked either R or D, and let us suppose that we
obtained by random selection a sample of 20 with 9 marked R and 11 marked
D. In Table I and Fig. 2 some classes of sample are relatively rare, i.e., of low
probability. In the left-hand tail, samples of the class 9 R and 11 D, and the
still rarer classes (8 R, 12 D; 7 R, 13 D; etc.) have a total probability of
0.012 + 0.004 + 0.001 = 0.017; i.e., less than 0.025. In other words, this
group of samples forms less than 2½% of the total samples. The capital letter
‘P’ is commonly used to designate such probabilities, and, in cases like the
present one, P, for any particular type of sample in a population, is the combined
probability of finding, by random sampling, such samples and also all samples
that are farther out in the same tail of the distribution.

The question, it will be seen, is not: “What would be the probability of
the particular type of sample?” Instead, we ask: “Would the sample belong
to a rare group (a group with low probability), i.e., would the sample be far
out in a tail?” Since our sample of 9 R and 11 D (45% R) would be in a
group that would rarely be found by random sampling from a population of 
70% R, it would obviously be safer to believe that the sample belongs to a 
population with a lower proportion of R than 70%; i.e., we feel justified in 
believing that, if we took more and more samples, we should approach some 
value lower than 70%. ‘Rarity’ in this connection is usually taken to mean a 
P value of less than 0.025—see Note 4. 
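The value of P for the sample of 9 R and 11 D can be obtained by summing the tail of the expansion directly, without first rounding each term. A minimal Python sketch, with a function name of our own choosing:

```python
from math import comb

def lower_tail(k, n, p):
    """Combined probability of samples with k or fewer individuals of the first
    class, i.e., the observed class and all classes farther out in the left tail."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(0, k + 1))

P = lower_tail(9, 20, 0.7)   # 9 R or fewer in a sample of 20 from a 70% R population
print(round(P, 3), P < 0.025)
```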

It may at first be somewhat difficult to understand why we consider not 
only the probability of the observed sample but also the probabilities of rarer 
samples. A full explanation is undesirable here, but one reason will be obvious 
if we visualize a distribution that slopes more slowly than in the present series, 
i.e., has long tails with many types of sample, each type with a rather low
probability. If we rejected each of these types because of its own probability 
we should, altogether, reject a considerable proportion of the total population. 


Note 4—SIGNIFICANCE AND NONSIGNIFICANCE 


As stated in Section A3, in a problem like that of Note 3, P = 0.025 is by 
convention taken as the boundary between significant and nonsignificant 
differences, and P = 0.005 is taken as the boundary between significant and 
highly significant differences. What this implies is seen from Table I (p. 68) 
and Figs. 2 and 3. In Note 3, assessing a sample with 9 R out of 20 (45%),
we cut off the tip of the left-hand tail containing not more than 2½% of the
samples. Likewise, if we were assessing a sample containing more than 70%
R, we should cut off the right-hand tail containing, at the most, 2½%. Applying
these standards throughout our work, we cut off 5% of samples (1 in 20), i.e.,
our level of significance, from both tails, is 5%, or P from both tails = 0.05,
or the odds against finding by random sampling one of our ‘rare’ samples are
19 to 1.


Fig. 3. To illustrate the conception of confidence limits. L.L. = lower limit. U.L. =
upper limit. (The figure shows two frequency distributions, Population A and
Population B, with the observed sample marked between them.)

When we set P = 0.005 as the boundary between significant differences
and highly significant (or very significant) differences, we cut off ½% of the
samples in each tail, i.e., a total of 1% of samples. Our level of significance,
from both tails, is therefore 1%, or P from both tails = 0.01, or the odds
against finding by random sampling one of our ‘very rare’ samples are 99 to 1.

If the reader will consider these statements carefully he will avoid being 
confused by the apparent conflict between P = 0.025 and P = 0.05 as 
standards of significance and similarly by the apparent conflict between 
P = 0.005 and P = 0.01. 


Note 5—REASONABLENESS OF SIGNIFICANCE CONVENTIONS


The 5% and 1% Criteria of Significance 

These two standards of significance are, of course, arbitrary and conven- 
tional, and anyone is at liberty to set his own standards, but the conventions 
seem reasonable when their implications are noted. They apply both when 
we are testing a sample against a known or hypothetical population value, 
and when we are comparing samples with each other. 

First let us consider all of our investigations in which there is no real differ- 
ence, i.e., where chance alone is responsible for the difference between our 
sample and the particular population value that we are postulating, or between 
samples that we are comparing with each other. If we accept the 5% level 
of significance throughout we shall in 95% of these investigations correctly 
proclaim no significant difference. If these judgments are considered as 
‘diagnoses’, the standard of accuracy will probably be admitted as a reasonable 
one. 

In our other investigations, in which a real difference does exist, there will 
be many in which we shall, accepting the 5% level, diagnose correctly, but 
there will, of course, be some cases where, although a real difference exists, 
we shall proclaim a nonsignificant difference. If, however, we tried to reduce 
this error by accepting P = 0.1 as the criterion, we should run the risk of a 
10% error in all those investigations where there was no real difference. If, 
on the other hand, we waited until P was 0.01 or 0.001 before accepting a 
difference as significant at all, we should run the opposite risk of rejecting real 
differences in many investigations. 


As some of the examples in Section B indicate, an observer may feel justified
in continuing an investigation when the samples already observed indicate
odds of less than 19 to 1. In other cases, he may demand odds of 100 to 1 or
higher, e.g., when he wishes to feel confident that a certain drug will produce
unfavorable reactions in not more than a certain small proportion of people.

Note.—The only safe way to avoid being swayed by the results of an investi-
gation is to set one’s standard of significance at the outset.


One-sided Comparisons

Some special consideration should be given to judgments that can be called 
‘one-sided comparisons’, in which the observer is concerned with differences 
in one direction only, i.e., with only one tail of a distribution. 


For example, he might test a treatment for a certain disease by taking 20 
pairs of animals (litter mates), applying the treatment to one member of each 
pair and leaving the other member as an untreated control. If he knew that 
the treatment could not impede recovery but found that in more than 10 of 
his pairs the control animals recovered better than the treated animals, he 
would attribute the difference to random sampling variation, and would not 
apply a test of significance. He would, in fact, never run the risk of erroneously 
proclaiming as significant any differences in this direction; and he would 
therefore be justified in setting P = 0.05 and P = 0.01 as his criteria of 
significance in the other direction, i.e., for samples in which treatment appeared
better than no treatment, for by so doing he would insure that his erroneous 
judgments would not exceed 5% (or 1%) of his total judgments in this type of 
investigation. 

Similarly, when comparing mortality in samples of treated and untreated 
animals, if the investigator knew that the treatment could do no harm, he 
would attribute to random sampling variation all cases in which treated
animals had a higher mortality than untreated animals. This again is a 
‘one-sided’ comparison, in contrast to the ‘two-sided’ comparison in which 
the observer would wish to know whether treatment or no treatment was 
better, or which of two treatments was better (see Section A4). In the one- 
sided comparison, since he would make erroneous judgments of significance in 
the one direction only, he might decide to set as his standards P = 0.05 and 
P = 0.01, where P is the probability of finding samples that showed a 
mortality difference in favor of the treatment as great as in the observed samples, 
and greater. 

To have presented in our tables information on probabilities for one-sided 
as well as two-sided comparisons would have entailed doubling most of the 
tables; and therefore, in most instances, the two types of comparison are 
treated in our examples as if they were the same. The tables and other 
calculations give us tests of significance for two-sided comparisons and we 
apply the same verdict even when the comparison is in fact one-sided. We 
are therefore taking P = 0.025 and P = 0.005 from a single tail as the 
standard of significance in both cases, and we are thereby merely setting our 
standards somewhat higher for judgments in which differences in only one 
direction are of interest, i.e., a potential error of 2½% (or ½%) instead of 5%
(or 1%). For instances of one-sided comparisons see especially Examples 6 
and 14. 

Note 6—CONFIDENCE LIMITS 


The technique and rationale of finding confidence limits are given briefly in 
Section A3, and the process is represented pictorially in Fig. 3 (p. 71) by two 
frequency distributions, symmetrical for convenience in drawing. They are 
similar to Fig. 2, but the tips of the ordinates (vertical lines) have been joined 
and most of the ordinates themselves have been omitted. 


For the lower limit, the problem is to find a population A, such that our
observed sample falls at the dividing line between the upper 2½% of the samples
and the rest of the samples. The true (population) percentage is indicated
by the mode or peak of the frequency distribution (at 70% R in Fig. 2).
This percentage is the required lower confidence limit at the 2½% level, i.e.,
for P = 0.025.


Similarly, for the upper limit we have to find a population B, such that our
observed sample falls at the dividing line between the lower 2½% and the rest
of the samples. The mode of this distribution will be the required upper
confidence limit at the 2½% level, i.e., for P = 0.025. The space between the
upper and lower limits is a confidence belt.

For confidence limits at the ½% level, i.e., for P = 0.005, we should push
the two populations, A and B, farther apart, i.e., enlarge the confidence belt.


In any actual problem, confidence limits could be found directly from
binomial expansions, by trying various values of p and q in the expression
$(p + q)^N$, where N is the total number of individuals in the sample; but this
is very laborious.


A commonly employed method, although much simpler, is often misleading
unless proper corrections are made. It is based on the normal curve (Note 7)
and uses the standard deviation (Note 8). To provide more accurate
estimates, Tables IA, IB, and II, and Graphs 1 to 6 (= Figs. 6 to 11) were prepared.
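The laborious trial of successive values of p and q described above is easily delegated to a computer. The following Python sketch searches by repeated halving for populations A and B; the function names are ours, and it is offered as an illustration of the reasoning in this Note, not as a reproduction of how Tables IA and IB were actually computed.

```python
from math import comb

def tail_at_least(k, n, p):
    """Probability of k or more individuals of the class in a sample of n."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def tail_at_most(k, n, p):
    """Probability of k or fewer individuals of the class in a sample of n."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(0, k + 1))

def solve_increasing(f, target, steps=60):
    """Find p between 0 and 1 with f(p) = target, for f increasing in p, by halving."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_limits(k, n, tail=0.025):
    """Lower and upper limits (as proportions) for k of the class observed in n."""
    # Population A: the observed sample and those above it form just `tail` of samples.
    lower = 0.0 if k == 0 else solve_increasing(lambda p: tail_at_least(k, n, p), tail)
    # Population B: the observed sample and those below it form just `tail` of samples.
    upper = 1.0 if k == n else solve_increasing(lambda p: -tail_at_most(k, n, p), -tail)
    return lower, upper

print(confidence_limits(9, 20))   # the sample of 9 R in 20 from Note 3
print(confidence_limits(0, 9))    # zero A's in nine, as in Example 39
```

For the sample of 9 R in 20 the upper limit falls below 70%, which agrees with the conclusion of Note 3 that the sample is unlikely to belong to a population of 70% R.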


Note 7—THE NORMAL FREQUENCY CURVE 


The normal curve is a widely used approximation to the binomial expansion, 
and when carefully employed it is very valuable. In this connection the term 
‘normal’ is used, not in contrast to ‘abnormal’, but in the sense of a ‘standard’ 
or measuring device (Latin, norma, a carpenter’s square). 


The development of the normal curve from the binomial expansion $(p + q)^N$
is pictorially presented in Fig. 4, which shows frequency distributions with
increasing size of sample (N). Where p = q = 1/2, as in Note 2, the binomial
expansion is symmetrical from the first, as is the normal curve, but even when
p and q are unequal the distributions approach closer and closer to the normal
curve when N is increased.


When a mathematician establishes the formula for the normal curve he is, 
in essence, proving the fact that is suggested by Fig. 4. The formula itself is 
seldom used in the everyday application of statistical methods because prob- 
ability tables, derived from the formula, are available. 


Note 8—THE STANDARD DEVIATION, $\sqrt{Npq}$


In any distribution, either of measurements or enumeration data, the 
variation, i.e., the amount of spread of the distribution, is expressed by a 
quantity called the standard deviation, the word ‘deviation’ being equivalent 
to ‘variation’. (Sometimes the term standard error is used instead of ‘standard 
deviation’, the term ‘error’ being again equivalent to ‘variation’.) 

For distributions of measurements the standard deviation has to be 
estimated from the measurements themselves, but for a binomial distribution, 
as in Table I (Note 2), it can be found from the formula √(Npq), where N = 
total number of individuals in the sample, p = probability of occurrence of 
one type of individual, A; q = probability of occurrence of the other type, 
not-A. 

In Table I, N = 20; p = the probability of recoveries (R) = 0.7; 
q = the probability of deaths (D) = 0.3. Standard deviation (S.d.) 
= √(20 × 0.7 × 0.3) = √4.20 = 2.05 out of 20, i.e., 10.25%. The validity 
of the formula √(Npq) can be proved mathematically, but the proof is 
unnecessary for investigators. Its accuracy can be illustrated by treating any 
binomial expansion such as that of Table I (Note 2) as a series of 
measurements and estimating the standard deviation by direct calculation. 


FIG. 4. Genesis of normal curve from binomial expansion with increasing size of sample 
(N). Panels: N = 7, (0.5 + 0.5)^7; N = 50, (0.5 + 0.5)^50; N = 100, (0.9 + 0.1)^100. 
The X axis represents class of sample as in Fig. 2. The Y axis represents probabilities 
or percentage frequencies. 


If percentages are used throughout, the formula √(Npq) becomes 
√(A(100 − A)/N), where A is the percentage of Class A in the population. 
For Table I this becomes √(70 × 30/20) = 10.25%. Therefore twice the 
standard deviation = 20.50%. 
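
The direct calculation mentioned above can be sketched as follows. This is an illustration under the stated values of Table I (N = 20, p = 0.7), treating the number of recoveries in each possible sample as a measurement weighted by its binomial probability; the variable names are not from the original.

```python
from math import comb, sqrt

N, p = 20, 0.7
q = 1 - p

# Probability of each class of sample: 0, 1, ..., 20 recoveries.
probs = [comb(N, r) * p**r * q**(N - r) for r in range(N + 1)]

# Treat the classes as measurements and compute the S.d. directly.
mean = sum(r * pr for r, pr in enumerate(probs))
variance = sum((r - mean) ** 2 * pr for r, pr in enumerate(probs))
sd_direct = sqrt(variance)

# The same quantity from the formula, in individuals and as a percentage.
sd_formula = sqrt(N * p * q)
sd_percent = 100 * sd_formula / N

print(round(sd_direct, 3), round(sd_formula, 3), round(sd_percent, 2))
```

The two standard deviations agree to within rounding error, which is the agreement the text refers to.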

Now in a normal curve, if we measure from the center (mean) of the 
distribution, twice the standard deviation, we exclude in each tail rather less than 
2½% of the total distribution. (1.96 S.d. cuts off almost exactly 2½%.) 
Applying this to the binomial distribution in Table I (Note 2) we measure 
off 2 S.d., i.e., 20.50, from 70% to give: 70 + 20.50 = 90.50% R; 
70 − 20.50 = 49.50% R. A sample with 45% R (i.e., 9 R and 11 D) is 
therefore beyond twice the standard deviation away from 70%, and if we met 
such a sample and knew nothing about its population, we could, by means of 
the standard deviation of a 70% population, decide that the sample percentage 
was significantly different from 70%. Two and one-half times the standard 
deviation (more exactly 2.57 S.d.), similarly used, excludes approximately ½% 
of the distribution in each tail; therefore it can be employed to determine 
highly significant differences. 


Used in this way, to test a sample against one particular population value, 
the standard deviation is a simple and often sufficiently accurate method. 
To use it for the estimation of confidence limits requires trial of several different 
population values, and the final result contains an unknown degree of error. 
The standard deviation method, moreover, becomes very undependable: (1) 
as samples become smaller; (2) as distributions become more skew (asym- 
metrical), i.e., as p and g become more unequal; (3) when, as is often done, 
the standard deviation is calculated from the sample itself. Three examples 
may be given: 

Example (1) 


In a sample of 100 are eight color-blind men. Does this differ 
significantly from a population percentage of 4? 

(a). Standard deviation estimated from population value = √(4 × 96/100) = 
√384/10 = 1.96%. Sample percentage minus population percentage = 4. 
4/S.d. = 4/1.96 = 2.04, which is large enough to indicate a significant 
difference. 

(b). Standard deviation estimated from the sample value = √(8 × 92/100) = 
√736/10 = 2.713. Sample percentage minus population percentage = 4. 
4/2.713 = 1.5, showing no evidence of significance. The lower confidence 
limit (P = 0.025) estimated in this way would be 8 − 2 S.d. = 8 − 5.426 = 
2.574%. 


(c). Table IB (number of A’s = 8; N = 100) shows that the lower con- 
fidence limit (P = 0.025) is 3.5%. There is no significant difference between 
the sample value and a population percentage of 4, but the proximity to 
significance is much greater than the standard deviation, estimated from the 
sample, would suggest, for the lower limit (P = 0.025) so estimated (2.574%) 
is lower than Table IB shows even for P = 0.005, viz., 2.6%. 

Example (2) 

In Example 26 of Section B, among 56 mental defectives, 14 (25%) had 
mothers with Rh negative blood. Is this significantly different from a popula- 
tion percentage of 15? 

S.d. = √(15 × 85/56) = √(1275/56) = √22.77 = 4.77. 


Sample percentage minus population percentage, divided by S.d. = (25 — 
15)/4.77 = 2.10 — significant. 

Tables derived from the normal curve show that the probability of finding 
samples in one tail beyond 2.10 S.d. is 0.0179, indicating odds of 9821 to 179, 
or about 55 to 1. By contrast, Table IB (number of A’s = 14; N between 
55 and 57) shows the lower limit (P = 0.025) to be between 14.7 and 14.1%. 
The sample has not quite reached the level of significance, and this judgment 
is confirmed by calculating the exact P from the binomial expansion, i.e., 
0.0343, almost double the value derived from the standard deviation, and 
corresponding to odds of only about 28 to 1. 

This example shows that, for a given criterion of significance (here P = 
0.025 for one tail), the use of the standard deviation may give a verdict of 
significance, whereas the true verdict is one of nonsignificance. 
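
The contrast in Example (2) can be reproduced with a short calculation. This is a sketch only: the normal tail is obtained from the complementary error function rather than from printed tables, and the exact tail is summed term by term from the binomial expansion for N = 56 and a population percentage of 15. Variable names are illustrative.

```python
from math import comb, erfc, sqrt

n, k = 56, 14            # sample size and number of Rh negative mothers
pop_pct = 15.0           # hypothetical population percentage

sample_pct = 100 * k / n
sd = sqrt(pop_pct * (100 - pop_pct) / n)
z = (sample_pct - pop_pct) / sd

# One-tail probability beyond z standard deviations on the normal curve.
normal_tail = 0.5 * erfc(z / sqrt(2))

# Exact one-tail probability: 14 or more A's among 56.
p = pop_pct / 100
exact_tail = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

print(round(z, 2), round(normal_tail, 4), round(exact_tail, 4))
```

The normal tail comes out near 0.018 and the exact tail near 0.034, so the two methods fall on opposite sides of the P = 0.025 criterion.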

Note in this instance the value of finding the exact P. The investigator 
may justifiably claim that there is no reason to suppose that mental defectives 
might tend to have a lower proportion of mothers with Rh negative blood than 
occurs in the general population, i.e., he is interested in a ‘one-sided’ com- 
parison (see Note 5) and would be satisfied with P = 0.05 from the one tail 
of the distribution—odds of 19 to 1 against finding by random sampling of the 
general population (15% Rh negative), samples containing 25% and more 
Rh negative persons. On that basis he would accept the result (P = 0.0343) 
as indicating a significantly higher percentage. 

Example (3) 

In a sample of 20 there are three A’s (15%). The standard deviation, 
estimated from the sample, is √(15 × 85/20) = 7.98. The lower confidence 
limit (P = 0.025), estimated from the standard deviation = 15 − 2 × 7.98 = 
minus 0.96, an obviously impossible value. The lower limit at P = 0.005, 
by use of 2½ S.d., would appear to be 15 − 19.95 = minus 4.95%. Table IB 
shows that the limits are respectively 3.2 and 1.8%. 

These three examples have shown that the use of the standard deviation of 
the binomial distribution, without a correction term, can be very misleading; 
and there is no simple rule whereby the investigator can tell when not to trust 
the method. Hence the value of tables such as IA, IB, and II. 
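
The uncorrected limits criticized in Examples (1) and (3) come from one small formula, sketched here so that its failure can be seen directly. The function name `sd_lower_limit` is hypothetical; it takes the standard deviation from the sample itself and subtracts a multiple of it from the sample percentage, with no correction term.

```python
from math import sqrt

def sd_lower_limit(k, n, multiple=2.0):
    """Lower limit (in %) as sample percentage minus `multiple` sample S.d."""
    pct = 100 * k / n
    sd = sqrt(pct * (100 - pct) / n)
    return pct - multiple * sd

ex1 = sd_lower_limit(8, 100)           # Example (1b): eight A's in 100
ex3 = sd_lower_limit(3, 20)            # Example (3): three A's in 20
ex3_half = sd_lower_limit(3, 20, 2.5)  # Example (3) with 2½ S.d.

print(round(ex1, 3), round(ex3, 2), round(ex3_half, 2))
```

For the sample of 20 both limits are negative percentages, which no population can have; the last decimal may differ slightly from the text because the text rounds the standard deviation before multiplying.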


Note 9—CHI SQUARE USED FOR TESTING A SAMPLE 
AGAINST A POPULATION VALUE 


In Example 13 there were 50 individuals with reactions after inoculation: 
mild, 13; moderate, 17; severe, 12; very severe, 8. The hypothesis to be 
tested was that the true (population) ratio was 1:1:1:1, indicating an 
equal probability, ¼ or 0.25 in each class. To test this we need to find whether 
the observed sample would occur rarely or commonly among random samples of 
50 from a population with the 1:1:1:1 ratio. The relative proportions of 
samples could be exactly ascertained by the expansion of (¼ + ¼ + ¼ + ¼)^50, 
which is like the binomial but is called a ‘multinomial’ expansion because it 
contains more than two classes. This would be a long process, and we can 
usually obtain a sufficiently accurate result by means of chi square, as in 
Example 13. 
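
For readers who want to see the arithmetic, the chi square for these four classes can be sketched as below. The figures are those quoted at the start of this Note; the calculation follows the usual form, the sum of (a − t)²/t over the classes, with no continuity correction because there are more than two classes. The critical value 7.815 (three degrees of freedom, P = 0.05) is the standard tabulated figure and is not quoted from this section.

```python
observed = {"mild": 13, "moderate": 17, "severe": 12, "very severe": 8}

n = sum(observed.values())                 # 50 individuals
expected = n / len(observed)               # 12.5 per class under a 1:1:1:1 ratio

chi_square = sum((a - expected) ** 2 / expected for a in observed.values())
degrees_of_freedom = len(observed) - 1

CRITICAL_5_PERCENT_3_DF = 7.815            # standard table value (assumption noted above)
significant = chi_square > CRITICAL_5_PERCENT_3_DF

print(round(chi_square, 2), degrees_of_freedom, significant)
```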

By following the rules of procedure one can safely apply the chi square test 
without probing more deeply into its nature; but those who wish to form 
some idea of what is being done in the test can perhaps be helped in the 
following way. 

First, let us imagine that we had a large population in which the numbers of 
individuals in the four classes, labeled as in our sample, were equal (a 1:1:1:1 
ratio), that we took a large number of samples of 50 and arranged a graph of 
the frequency distribution. It would be more complicated than the distri- 
bution in Fig. 2 (Note 2) because there would be four classes instead of two; 
but for simplicity let us imagine it represented by such a graph. The samples 
that had ratios nearest to 1:1:1:1 would be most numerous and would 
form the hump of the graph; those that were most unlike that ratio would be 
much rarer and would lie in the tails of the graph. Now let us suppose that 
chi square values, with the ratio 1:1:1:1 as the hypothetical ratio, were 
estimated for each of the samples. The greater the discrepancy from this 
ratio, the larger would be chi square and the rarer would be its occurrence, i.e., 
the high chi square values would be in the tails of the distribution. 

Next, we can suppose that the actual frequency distribution was replaced 
by a normal curve (Note 7). From the equation for the normal curve we 
could indicate the frequency with which the various values of chi square 
occurred. In fact, although the process is not nearly so simple as this, it is 
from the normal curve that probability tables of chi square are derived. 

Note that P = 0.05 and 0.01 are used in assessing significance by chi 
square, not, as in Tables IA, IB, and II, P = 0.025 and 0.005. This will be 
best understood from a simpler example in Note 11, and then applied to the 
more complicated problem discussed here. 

Finally, it should be observed that there is no need for separate tables of 
chi square related to size of sample or to different ratios (1:1:1:1; 
9:3:3:1; etc.), because these characteristics have already entered into, 
and have been allowed for by, the calculation of chi square itself. 


Note 10—LIMITATIONS OF THE CHI SQUARE TEST 


As stated in Note 9, probability tables for chi square are based on the 
normal curve, which is a continuous and symmetrical distribution. Therefore 
there are two factors that render probabilities derived from chi square inexact 
estimates of the true probabilities obtained from binomial or multinomial 
expansions: (1) Continuity, (2) Symmetry. 

Continuity.—If chi square is used for samples that contain only two classes, 
Yates’s simple correction for continuity can be applied (Notes 11 and 14)—see 
Fisher (9), Section 21.01. Where there are more than two classes, and when 
contingency tables are larger than fourfold, the correction is not applicable, 
but the error is less serious then. 


Symmetry.—As has been exemplified for binomial distributions, the graphs 
of exact probabilities are, except with 1 : 1 ratios, asymmetrical. Increase of 
sample size reduces the error of chi square from this factor, but the important 
feature is not total size of sample but the size of the theoretical (t) values. 

These limitations dictate the precautions specified in Examples 13, 20, and 
27 of Section B. 


Note 11—CHI SQUARE USED WITH TWO-CLASS SAMPLES 


Even when there are only two classes in a sample, chi square is often used 
in testing the sample against a population ratio, e.g., 1:1, or 3:1, or 5:2. 
For this purpose of course, Tables IA, IB, II, and III are not only simpler but 
more accurate than chi square. However, with samples large enough for both 
t values to be 10 or more, the chi square probability does not depart so widely 
from the exact (binomial) probability as to be misleading in tests of signi- 
ficance, and it is perhaps desirable to exemplify this use of chi square, especially 
because it involves the ‘correction for continuity’, which is met also in the 
application of chi square to contingency tables. 


Correction for Continuity 


In the calculation of chi square for two-class samples the only difference 
from the calculation for samples with more than two classes (see Example 13) 
is that each of the two (a − t) values is reduced by half a unit (0.5). This 
is called the ‘correction for continuity’, and the resulting chi square is called 
‘chi square corrected for continuity’ (χ²c). The correction makes the 
probability derived from chi square, i.e., based on the normal (continuous) 
curve, more like the exact probability that would be derived from the binomial 
distribution, a discontinuous series. Note.—The correction is a reduction in 
size, regardless of the sign, positive or negative, of (a − t). 


Method of Applying the Chi Square Test 


A sample of 50 contains 32 X’s and 18 Y’s. Is it unlikely that the 
population value was 50%, a 1:1 ratio? We proceed as follows: 


                          X         Y 
a                        32        18 
t                        25        25 
(a ~ t)                   7         7 
(a ~ t) − 0.5           6.5       6.5 
(a ~ t − 0.5)²        42.25     42.25 
(a ~ t − 0.5)²/t       1.69      1.69 

(The symbol ~ indicates ‘difference between’, regardless of sign, + or −.) 

Chi square (corrected for continuity) = 1.69 + 1.69 = 3.38. 

Note.—For a 1 : 1 population ratio, as here, χ²c = 2 (a ~ t − 0.5)²/t. 
Degrees of freedom = 2 − 1 = 1. Table VII shows that the lowest 
significant value (P = 0.05) for one degree of freedom is 3.841. Therefore 
our sample shows no significant difference from a population ratio of 1 : 1. 
Evaluated more precisely from the Appendix Table 4B of Yule and Kendall 
(30), or by interpolation in our Table VII (see Fisher (9), Section 21.1 for 
method), P = 0.0660. 


P and ½P from Chi Square 

In assessing significance by chi square we commonly use P = 0.05 and 
0.01 as standards, not P = 0.025 and 0.005. To elucidate this, let us test 
the present sample by the binomial expansion (p + q)^50, where p = q = ½. 
The tail containing the observed sample runs: 


         X       Y      Probability 
        32      18        0.0160 
        33      17        0.0087 
        34      16        0.0044 
        35      15        0.0020 
        36      14        0.0008 
        37      13        0.0003 
        38      12        0.0001 
        39      11        0.0000 
        50       0        0.0000 
                      P = 0.0323 


The opposite tail of the same distribution contains samples as rare as those 
in the tail where the sample lies, and, because the distribution is symmetrical, 
the values are the same: X = 18, Y = 32—0.0160; X = 17, Y = 33—0.0087; 
etc., giving a total of 0.0323. Now chi square represents both tails; i.e., P 
from chi square is the probability of finding all samples that are as rare as, 
and rarer than, the observed sample. It takes into account the samples that 
differ from the hypothesis in the same direction as the observed sample, and 
also those that differ in the opposite direction. To get the probability of the 
one tail in which the sample lies, we can take ½P. Thus, P = 0.0660, ½P = 
0.0330, which is not far from the exact value, 0.0323. 
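
The comparison between ½P from chi square and the exact tail can be checked with a short sketch. It assumes that, for one degree of freedom, the chi square probability equals the two-tail area of the normal curve beyond √χ², which is computed here with the complementary error function instead of printed tables. The exact tail is summed without rounding the individual terms, so it may differ in the fourth decimal from the column total above.

```python
from math import comb, erfc, sqrt

a = (32, 18)                      # observed X's and Y's
n = sum(a)
t = (n / 2, n / 2)                # theoretical values for a 1:1 ratio

# Chi square corrected for continuity: each (a ~ t) reduced by 0.5.
chi_c = sum((abs(ai - ti) - 0.5) ** 2 / ti for ai, ti in zip(a, t))

# P for one degree of freedom (both tails), and half of it (one tail).
p_both_tails = erfc(sqrt(chi_c / 2))
half_p = p_both_tails / 2

# Exact one-tail probability from the expansion (1/2 + 1/2)^50.
exact_tail = sum(comb(n, x) for x in range(a[0], n + 1)) / 2**n

print(round(chi_c, 2), round(p_both_tails, 4), round(half_p, 4), round(exact_tail, 4))
```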


Applied to the more complicated problem in Note 9, we can say that chi 
square includes (1) the samples, like the observed sample, with excess of 
‘mild’ and ‘moderate’ reactions, and (2) samples that have excess of ‘severe’ 
reactions. 


Note 12—THE GENERAL PROCEDURE IN CONTINGENCY TESTS 


The general procedure in contingency tests can be illustrated from the data 
of Example 15: 


                    R      D    Total 
Treatment V         2      7      9 
Treatment W         5      1      6 
Total               7      8     15 


Whatever particular method is used—the exact (factorial) method (Note 13) 
or chi square (Note 14)—the procedure can be outlined as follows: 


(1). Assume that the two samples are random samples from the same 
population.* 

(2). Find from the available information, i.e., the two samples, the best 
estimate of the composition of the assumed population. From the above 
table the best estimate of the population ratio R : D would be 7:8. (This 
procedure is comparable to the ordinary practice of getting the best estimate 
from two or more measurements by adding them together and finding their 
average. ) 


(3). Calculate how often pairs of samples of the same size as the observed 
samples and showing the same and greater differences would be met in 
random sampling of the one population. More precisely, calculate P, the 
probability of finding, by random sampling, pairs of samples that are as rare 
as, and rarer than, the observed samples. 

(4). If P is low, indicating that such pairs of samples would rarely be 
obtained from the same population, accept it as a proof that the populations 
are different, and call the difference between the samples ‘significant’, or if P 
is very low call the difference ‘highly significant’. If P is not low, call the 
difference ‘not significant’, indicating that there is not sufficient evidence to 
show that the samples have come from different populations. (For conven- 
tional standards of significance, see Section A4 and Notes 13 and 14.) 


Alternatively, of course, the verdicts can be expressed in terms of ‘homo- 
geneity’ and ‘association’ (Section A4). 


Dangers in Interpretation of Contingency Tests 


It is important to note the wording of the verdicts just given. For example, 
a nonsignificant difference is not a proof that the samples have come from the 
same population. Particular care should be taken with the word ‘association’, 
which does not imply causation. We may have concluded that a significant 
association is present, but we have not thereby proved, or even suggested, a 
cause-and-effect relation. If there is an association between X and Y, X may 
be the cause of Y or vice versa, or both may be due to some common cause, or 
the relation may be more indirect and complex. 

* Strictly, we are dealing with the same population of samples in the exact (factorial) test, and 
with the same population of individuals in the chi square test; but this distinction is not important 
for an elementary grasp of the general procedure. 


When time is a factor, associations can be very misleading. For example, 
a survey of the past forty years would doubtless show an increase in (a) 
consumption of cigarettes, (b) reported mortality from coronary artery 
disease, and (c) travel by air; and an association between these three could 
therefore doubtless be established. We should not think of attributing (c) to 
(a), or (a) to (c), and the mere proof of association between (a) and (b) gives 
no more reason for suspecting a causal relation between those two phenomena, 
however plausible such a relation may seem. 


Contingency tests, like all other tests of significance, show only how far 
chance could account for the results. The interpretation is left to the observer. 


Note 13—THE EXACT METHOD FOR CONTINGENCY TESTS 


The exact method (Fisher (9), Section 21.02) looks complicated and 
laborious, but can be easily carried out by anyone who can use logarithms. 
It is the method that (1) provides exact probabilities, to which the 
probabilities derived from chi square are an approximation; (2) tests the accuracy 
of the chi square method; and (3) was used for the construction of Tables IV, 
V, and VI. 


The exact method starts with a display of fourfold tables showing all the 
possible pairs of samples that could be found under the imposed conditions 
(see Note 12), namely, that the sizes of samples be the same as in the observed 
samples, and that the composition of the population (the population ratio), 
estimated from the observed samples, remain constant. These conditions are 
realized by keeping the subtotals in all the tables the same as in the original 
table. 


The random sampling probabilities of the various possible pairs of samples 
are then found from a formula; but it should be noted that this formula is 
based on experience, just as is the use of the simple binomial expansion 
(p + q)^N discussed in Note 2. Like the binomial expansion, the formula 
involves the use of factorials (see Note 22 and use Table VIII). 


The data of Note 12 are: 


                    R      D    Total 
Treatment V         2      7      9 
Treatment W         5      1      6 
Total               7      8     15 


The compartments in the body of the table, occupied by 2, 7, 5, and 1, are 
cells. The subtotals, 7, 8, 6, and 9 are marginal totals; 15 is the grand total. 


The main procedure will be summarized later, but the detail of the following 
seven stages should be noted first: 


(1). With the marginal totals and grand total constant, display all the 
possible arrangements of numbers in the cells. The simplest way is to begin 
by making all the possible changes in the row or column that contains the 
smallest marginal total (here the second row), running up and down from the 
observed samples. Then change the other cells correspondingly. For 
reference, we letter the sets of pairs. 


Possible pairs of samples              Possible pairs of samples 
(a)  1 | 8                             (e)  5 | 4 
     6 | 0                                  2 | 4 
(b)  2 | 7  (observed samples)         (f)  6 | 3 
     5 | 1                                  1 | 5 
(c)  3 | 6                             (g)  7 | 2 
     4 | 2                                  0 | 6 
(d)  4 | 5 
     3 | 3 

(2). Add together the logarithms of the factorials of the four marginal 
totals and subtract the logarithm of the factorial of the grand total. 


                             Log factorial 
Marginal totals      7          3.7024 
                     8          4.6055 
                     9          5.5598 
                     6          2.8573 
                               ------- 
                               16.7250 
Grand total         15        −12.1165 
                               ------- 
                                4.6085 


The result can be called the ‘marginal value’. 


(3). Take the first of the possible pairs of samples. Find the logarithm of 
the factorial of each of the numbers in its four cells and add these logarithms 
together: 

                Log factorial 
        6          2.8573 
        8          4.6055 
                   ------ 
                   7.4628 

(The factorial of 1 is 1, and so is the factorial of zero. Therefore the logarithm 
is zero.) 


Do likewise for each of the other possible pairs, and either enter each of the 
sums near its own pair of samples or mark them to show from which samples 
they are derived: 


Sums of logs of factorials        Sums of logs of factorials 
(a)  7.4628                       (e)  5.1406 
(b)  6.0826                       (f)  5.7147 
(c)  5.3167                       (g)  6.8607 
(d)  5.0158 

(4). From the ‘marginal value’ found in (2) subtract the first of the seven 
quantities found in Stage (3): 4.6085 − 7.4628 = 3̄.1457. (The bar over the 
characteristic marks it as negative while the mantissa stays positive; thus 
3̄.1457 = −3 + 0.1457 = −2.8543.) Do likewise for the other six quantities 
in Stage (3), to give: 


(a)  3̄.1457                       (e)  1̄.4679 
(b)  2̄.5259                       (f)  2̄.8938 
(c)  1̄.2918                       (g)  3̄.7478 
(d)  1̄.5927 


(5). Convert to antilogarithms the quantities found in Stage (4). These 
antilogarithms are probabilities. Record each of them alongside its own pair 
of samples: 


        (a)   0.0014 
        (b)   0.0336  (observed samples) 
        (c)   0.1958 
        (d)   0.3914 
        (e)   0.2937 
        (f)   0.0783 
        (g)   0.0056 
              ------ 
              0.9998  (total) 


(6). Since all possible types of sample are displayed, the sum of the 
probabilities should be 1.0000. Test the whole calculation by adding them 
together. As the fourth decimal figure in each item is only an approximation, 
the sum often has an error, excess or defect, in the fourth place, as in this 
example. Values of 0.9997 or 1.0003 can be passed. Occasionally greater 
errors are due to the same cause, but before they are accepted it is advisable 
to repeat the whole calculation, sometimes with logarithms that contain 
seven decimal figures. 


(7). P is the probability of the observed pair of samples plus the probability 
of rarer samples in the same tail. Therefore P = 0.0336 + 0.0014 = 0.0350. 

Summary of Main Steps 

It will be seen that the main part of the calculation can be summarized thus: 

From the totals find (7! 8! 9! 6!)/15!, and multiply it in turn by each of the 
quantities derived from the cells: 

        1/(1! 8! 6! 0!),   1/(2! 7! 5! 1!),   etc. 


Partial Computation 

In the problems where the exact method is most needed it is not very 
laborious, for it often requires no more than half a dozen probabilities; and if 
we are willing to forego the benefit of the automatic check by means of the 
total (1.0000) we require only the probabilities from the observed samples to 
the end of the tail in which they lie, or until we meet a probability of 0.0000. 

It is often sufficient to calculate only one term in the series, i.e., the prob- 
ability of the observed pair of samples: 

(1). When the observed samples are at the end of the series, i.e., contain 
zero in one cell. 


(2). When the probability of the observed samples alone is greater than 
0.025 and we desire merely a statement of significance, not the exact value 
of P. If the observed samples are in a long tail, therefore, it saves labor to 
calculate their probability first. If it is below 0.025, one must, of course, 
calculate also the probabilities of the rarer samples in the same tail. 
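
The labor-saving rule in (2) can be sketched as a small routine. This is an illustration, not part of the original method: it obtains each term from binomial coefficients rather than from logarithms, decides the direction of the tail by comparing the two diagonal products of the table, and uses the hypothetical names `table_probability`, `tail_probability`, and `verdict`.

```python
from math import comb

def table_probability(a, b, c, d):
    """Probability of one fourfold table, its marginal totals being fixed."""
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)

def tail_probability(a, b, c, d):
    """Observed table plus the rarer tables in the same tail."""
    # Step away from the middle of the series by shrinking the weaker diagonal.
    step = (-1, 1, 1, -1) if a * d < b * c else (1, -1, -1, 1)
    total = 0.0
    while min(a, b, c, d) >= 0:
        total += table_probability(a, b, c, d)
        a, b, c, d = (v + s for v, s in zip((a, b, c, d), step))
    return total

def verdict(a, b, c, d, criterion=0.025):
    """Calculate the observed term first; go on to the tail only if needed."""
    p_observed = table_probability(a, b, c, d)
    if p_observed > criterion:
        return "not significant", p_observed     # one term was enough
    p_tail = tail_probability(a, b, c, d)
    return ("significant" if p_tail <= criterion else "not significant"), p_tail

print(verdict(2, 7, 5, 1))    # data of Note 12
```

With the data of Note 12 the observed term alone exceeds 0.025, so the routine stops after a single term.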


Precise Assessment of Chances and Odds 


Sometimes, before further investigation is carried out, it is desirable to 
weigh the available evidence very precisely. In Example 25 of Section B the 
data are: 


Group        Rh negative    Rh positive    Total 
(a)               6              47          53 
(b)              14              42          56 
Total            20              89         109 


Chi square (corrected for continuity) = 2.55; ½P from chi square is between 
0.10 and 0.05, nearer the latter. If ½P = 0.025 is the criterion, the difference 
between the two groups is not significant, but the observer might say: 
“Although the difference does not reach the conventional standard of 
significance, I should like to pursue the investigation. I should, however, like to 
know the chances of my being led astray, i.e., of seeking for a real difference 
when no real difference exists. In problems like the present one, I should 
not pursue the investigation if the observed samples were in the opposite tail, 
i.e., if they differed in the opposite way from what I expected, in this instance 
a higher proportion of Rh negative in Group (a) than in Group (b)”. 

The problem being a one-sided comparison (Section A4; Section C, Note 5), 
the observer can be informed that, if he pursues investigations in which ½P 
from chi square is between 0.05 and 0.10, he will in the long run be led astray, 
in the sense specified above, in between 5 and 10% of his investigations. (If 
it were a two-sided comparison he would be led astray in from 10 to 20% of 
his investigations.) It may, however, be desirable to calculate the 
probability precisely. The probabilities of the observed samples and of those that 
differ more, in the same direction, are: 


        Samples                Probability 
        6 : 47/14 : 42           0.0371 
        5 : 48/15 : 41           0.0130 
        4 : 49/16 : 40           0.0034 
        3 : 50/17 : 39           0.0006 
        2 : 51/18 : 38           0.0001 
        1 : 52/19 : 37           0.0000 
        0 : 53/20 : 36           0.0000 
                                 ------ 
                                 0.0542 


We can therefore replace the rough estimate, 5 to 10% error, by the precise 
statement: 5.42%. 

In this instance the exact probability has not added much to the knowledge 
gained from chi square; but in many cases, especially where samples are small, 
a verdict of nonsignificance corresponds to a surprisingly high probability, i.e., 
the chances of fruitless investigation are very great if the verdict is disregarded. 
In Example 18 of Section B the data were: 


                              Recoveries    Deaths    Total 
Sulphonamide alone                 1           4         5 
Sulphonamide-penicillin            8           7        15 


Table V shows that there is no significant difference, but does not provide a 
probability statement. The exact method gives: 


        Samples              Probabilities 
(a)     5:0/4:11               0.0081 
(b)     4:1/5:10               0.0894 
(c)     3:2/6:9                0.2981 
(d)     2:3/7:8                0.3832 
(e)     1:4/8:7                0.1916  (observed samples) 
(f)     0:5/9:6                0.0298 
                               ------ 
                               1.0002 


The probability, P, for (e) the observed samples and (f) the other possible 
rarer samples in the same tail = 0.1916 + 0.0298 = 0.2214. 


The implications of this result are stated in Example 18. 


Note 14—CHI SQUARE IN CONTINGENCY TESTS 


In Example 20 of Section B the data were: 


                              Users of DDT    Nonusers of DDT    Total 
Soldiers with scabies            (a) 29           (b) 23           52 
Soldiers without scabies         (c) 64           (d) 36          100 
Total                                93               59          152 


(The cell contents are lettered for convenience of reference.) 

This is a fourfold table, and chi square (corrected for continuity) was calculated 
by formula and found to be 0.66. The formula is not an approximation but 
a condensation of the step-by-step method, to be shown here because it 
displays the rationale of the test, and because it is necessary for tables larger 
than fourfold. The steps are: 


(1). Assume that treatment has no effect. Then the best estimate of the 
proportion of scabies patients is 52/152. 


(2). On that assumption, calculate the number of scabies patients to be 
expected in a sample of 93, i.e., (52/152) of 93 = 31.81. This is the ‘expected’, 
‘hypothetical’, or ‘theoretical’ value, t, corresponding to the observed or actual 
value, a, 29. Likewise, or by subtraction from the marginal totals, find the 
t values for the other three cells. (a is, of course, not to be confused with the 
cell letter (a).) 

(3). Find the difference between a and t, indicated by a ~ t. 


(4). Reduce each a ~ t by 0.5. This is the correction for continuity 
(Notes 10 and 11), and is not to be used for tables larger than fourfold. 


(5). Square each of the values found in (4) and divide each square by the 
corresponding t. 


(6). Add together the items found in (5), to give chi square. 
In the present example: 


                         (a)           (b)           (c)           (d) 
Actual (a)                29            23            64            36 
Theoretical (t)        31.81         20.19         61.19         38.81 
a ~ t                   2.81          2.81          2.81          2.81 
a ~ t − 0.5             2.31          2.31          2.31          2.31 
(a ~ t − 0.5)²/t    5.34/31.81    5.34/20.19    5.34/61.19    5.34/38.81 
                      = 0.17        = 0.26        = 0.09        = 0.14 

χ²c, i.e., chi square corrected for continuity = 0.17 + 0.26 + 0.09 + 0.14 = 
0.66. 
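
The six steps can be written out as a short routine for any fourfold table. The sketch below applies them to the scabies data; it carries full precision throughout, so the intermediate figures differ slightly in the last decimal from the rounded values shown above. Variable names are illustrative.

```python
observed = [[29, 23],      # soldiers with scabies: users, nonusers of DDT
            [64, 36]]      # soldiers without scabies

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_c = 0.0
theoretical = []
for i, row in enumerate(observed):
    for j, a in enumerate(row):
        # Steps 1 and 2: theoretical value from the marginal totals.
        t = row_totals[i] * col_totals[j] / grand_total
        theoretical.append(t)
        # Steps 3 to 5: difference, continuity correction, square, divide by t.
        chi_c += (abs(a - t) - 0.5) ** 2 / t

# Step 6 is the running sum; degrees of freedom = (rows - 1) x (columns - 1).
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print([round(t, 2) for t in theoretical], round(chi_c, 2), degrees_of_freedom)
```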


Degrees of Freedom 


The rule given in Example 20 is: number of degrees of freedom = (number 
of rows of cells minus one) X (number of columns of cells minus one). For a 
fourfold table there is therefore one degree of freedom. The rule is easy to 
apply without explanation, but the meaning of the term ‘freedom’ can perhaps 
be elucidated by a simple example. In a fourfold table, to calculate the t 
values it is sufficient to calculate t for any one cell, because the marginal totals, 
along with the one t value, determine what the other three t values must be, 
i.e., they are not independent or ‘free’. 

It might be asked why Table VII is not constructed according to numbers of 
cells; but the rule concerning degrees of freedom shows that a 12-cell table 
containing two rows and six columns has five degrees of freedom, whereas if it 
contains four rows and three columns it has six degrees of freedom. Moreover, 
chi square is used for many problems besides testing contingency tables, and 
for all purposes the random sampling probabilities of chi square depend on 
the degrees of freedom as defined for each type of problem. 

Probabilities from Chi Square 

Table VII shows that, for one degree of freedom, a chi square of 0.66 has a 
probability, P, between 0.30 and 0.50. More precisely determined from the 
table of Yule and Kendall (30), Appendix Table 4A, it is 0.4166. Applying 
the exact (factorial) method to our present contingency table, P is found to 
be 0.2079. 

There is no fundamental discrepancy between these two results because the 
exact method gives the probability for one tail only (the tail in which the 
observed samples lie), whereas chi square gives the probability from both 
tails. We can picture the possibilities displayed as in Note 13. Beginning 
with the observed samples and running in the same direction, i.e., a lower 
incidence of scabies in the users of DDT (29, 23; 28, 24; 27, 25; etc.), there 
would be one tail of the distribution. Running from the observed samples in 
the opposite direction, there would first be samples in which the incidence 
was more and more similar in the users and nonusers of DDT; then there 
would be the other tail, in which there was a higher incidence of scabies in 
those who used DDT. 


Applying the chi square test to each of the samples we should find that it 
was lowest in the middle, where the samples were most alike, i.e., where there 
was most agreement between the observations and the hypothesis, for the 
hypothesis is that the samples came from the same population in respect of 
scabies incidence. Farther and farther in both directions chi square would 
increase, and chi square tables, e.g., Table VII, show simply the probability 
of finding chi square values greater than certain specified values. 


This explains why in the chi square test we take P = 0.05 and 0.01 as
standards, instead of 0.025 and 0.005 as in the exact method. For fourfold
tables we can halve the P from chi square to give an estimate of the probability
found by the exact method. In the present example P from chi
square = 0.4166; ½P = 0.2083. The exact P = 0.2079. A little discrepancy
remains because asymmetry has not been corrected for, and the
correction for continuity is not perfect.

As another illustration, in Example 21 of Section B, chi square = 8.7.
P = 0.00318 (Yule and Kendall (30), Appendix Table 4B), i.e., ½P = 0.00159.
Exact P = 0.00267. There is considerable discrepancy, but with larger 
samples and greater symmetry the agreement becomes closer, and the rules 
given in Section B, Examples 20 and 27, safeguard against serious error. 

Note.—For tables larger than fourfold we use chi square probabilities 
directly, without dividing them. 
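
For readers without the tables of Yule and Kendall, the probabilities quoted above can be approximated directly. The sketch below relies on the fact that, with one degree of freedom, chi square is the square of a normal deviate; the function name is an illustrative assumption, and the values are a check on the tabulated figures, not a replacement for the exact method.

```python
import math


def chi_square_p_one_df(chi_square: float) -> float:
    """Probability of exceeding chi_square with one degree of freedom (both tails).

    With one degree of freedom chi square is the square of a standard normal
    deviate, so P = erfc(sqrt(chi_square / 2)).
    """
    if chi_square < 0:
        raise ValueError("chi square cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2.0))


# The two chi square values discussed in this Note
for chi_square in (0.66, 8.7):
    p = chi_square_p_one_df(chi_square)
    print(f"chi square = {chi_square}: P = {p:.4f}, half P = {p / 2:.5f}")
```

Halving P in this way applies only to fourfold tables, as stated above.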
Note 15—THE USE OF THE STANDARD DEVIATION, $\sqrt{Npq}$,
IN COMPARISON OF SAMPLES


In Example 25 of Section B the data are:

Group      Rh negative   Rh positive   Total

(a)              6            47          53
(b)             14            42          56

Total           20            89         109

Chi square (corrected for continuity) = 2.55. P from Table VII is between
0.20 and 0.10, i.e., ½P is between 0.10 and 0.05. More precisely (from
Appendix Table 4B of Yule and Kendall) ½P = 0.0552. The exact P =
0.0542 (see Note 13). Therefore chi square gave a close approximation.

Instead of chi square many workers still use in such comparisons the
standard deviation or standard error, $\sqrt{Npq}$, or its equivalent, in percentage
form, $\sqrt{A(100 - A)/N}$ (see Note 8). If Rh negative individuals are
called A:

In Group (a), percentage of A’s = 6 × 100/53 = 11.32;
In Group (b), percentage of A’s = 14 × 100/56 = 25.00.

The standard deviations are:

Group (a): $\sqrt{11.32 \times 88.68/53} = \sqrt{18.9408}\%$;
Group (b): $\sqrt{25 \times 75/56} = \sqrt{33.4821}\%$.

Standard error of difference = $\sqrt{(\text{S.d. of Group (a)})^2 + (\text{S.d. of Group (b)})^2}$
= $\sqrt{18.9408 + 33.4821} = \sqrt{52.4229} = 7.24\%$.

Difference between (a) and (b) = 25.00 − 11.32 = 13.68%.
Difference/S.e. of difference = 13.68/7.24 = 1.89.

If a difference is more than twice its standard error (more precisely 1.96 S.e.)
it is considered significant (P is less than 0.05). The present difference has
not quite reached the minimum standard, but is near enough to make one
think that a few more cases might easily show a significant difference. There
is therefore considerable discrepancy between this and the chi square result.
P from standard error = 0.0589 (Yule and Kendall (30), Appendix Table
3); ½P = 0.0294. P from chi square = 0.1104; ½P = 0.0552. Exact P
(corresponding to ½P from chi square or standard error) = 0.0542.

The chief reason for the discrepancy is that, although the standard error
test is based on the normal curve, no correction is made for continuity. Such
a correction can be introduced, but it complicates the calculation, and even
without that complication the standard error test is arithmetically no simpler
than the calculation of chi square (corrected for continuity) from the formula
for fourfold tables. There are, however, more weighty reasons for preferring
chi square:
(1). Although the standard error is often a good enough crude test of
significance, we do not know when to cease trusting it; whereas the weaknesses
of chi square have been rather thoroughly explored (see, for instance,
the rules given in Section B, Example 20).

(2). Chi square values from a number of contingency tests can be combined
directly (Section B, Example 31).

(3). Chi square is the test used for tables larger than fourfold.

(4). Chi square can be used in many types of problem besides contingency
tests.
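
The three tests compared in this Note can be sketched in Python for the Rh data of Example 25. The function names are illustrative assumptions, and the exact test is written here with binomial coefficients, which is arithmetically equivalent to the factorial method but is not the article's own layout of the computation.

```python
import math


def chi_square_corrected(a: int, b: int, c: int, d: int) -> float:
    """Chi square, corrected for continuity, for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    shrunk = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * shrunk ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def difference_over_standard_error(a: int, n1: int, c: int, n2: int) -> float:
    """Difference between two sample percentages divided by its standard error."""
    pct1 = 100.0 * a / n1
    pct2 = 100.0 * c / n2
    var1 = pct1 * (100.0 - pct1) / n1
    var2 = pct2 * (100.0 - pct2) / n2
    return abs(pct2 - pct1) / math.sqrt(var1 + var2)


def exact_one_tail(a: int, b: int, c: int, d: int) -> float:
    """Exact one-tail P: the observed table plus rarer tables in the same tail,
    all marginal totals being held fixed."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    low, high = max(0, col1 - row2), min(row1, col1)
    expected = row1 * col1 / n
    # Sum toward whichever tail the observed cell lies in
    cells = range(low, a + 1) if a <= expected else range(a, high + 1)
    ways = sum(math.comb(row1, x) * math.comb(row2, col1 - x) for x in cells)
    return ways / math.comb(n, col1)


# Group (a): 6 Rh negative, 47 Rh positive; Group (b): 14 and 42
print(round(chi_square_corrected(6, 47, 14, 42), 2))            # 2.55
print(round(difference_over_standard_error(6, 53, 14, 56), 2))  # 1.89
print(round(exact_one_tail(6, 47, 14, 42), 4))
```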

Note 16—CONFIDENCE LIMITS OF SAMPLE DIFFERENCES*


In Section B, Example 32, the users of DDT had 31.2% incidence of
scabies, the nonusers had 39.0%. We assume that, in respect of incidence
of scabies, there are two separate populations (see Fig. 5, in which, for convenience
of drawing, the populations have been indicated as symmetrical
curves).

FIG. 5. To illustrate the conception of confidence limits of sample differences.
(Curves: Population A, users of DDT, observed sample 31.2%; Population B,
nonusers of DDT, observed sample 39.0%. Horizontal axis: percentage incidence
of scabies.)


We estimate the true values for these populations by finding the lower
confidence limit (at P = 0.025) from our sample of users of DDT and the
upper confidence limit from our sample of those who did not use DDT. This
is equivalent to pushing the two populations apart until only 2½% of samples
in Population A would lie to the right of the observed sample of users of
DDT, i.e., 97½% would lie to the left of the observed sample, and similarly
97½% of samples in Population B would lie to the right of the observed sample
of nonusers of DDT.


Applying the same standard in all our estimates of this kind, we shall in
97½%, i.e. 0.975, of our judgments, correctly state that the population value
for A is not lower than the one so estimated. Likewise for our estimates for
Population B. Therefore in 0.975 × 0.975, i.e., in approximately 95%,
of our combined judgments regarding the populations, we shall correctly say
that the true value for A is no lower than our estimate, and that the true
value for B is no higher than our estimate. Our possible error is therefore not
greater than 5%, a probability of 0.05.

* I am indebted to Dr. J. W. Hopkins and Mr. N. Keyfitz for the solution of this problem.

If a wider confidence belt is adopted (P = 0.005) the error is estimated 
similarly, i.e., by squaring 0.995 to give approximately 0.99, the probability 
of error being therefore 0.01 or 1%. 
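
The squaring argument can be checked in a line or two of Python. The sketch assumes, as the argument above does, that the two samples are independent; the function name is illustrative.

```python
def joint_confidence(one_tail_p: float) -> float:
    """Chance that both one-sided statements hold, the two samples being independent."""
    single = 1.0 - one_tail_p
    return single * single


for p in (0.025, 0.005):
    joint = joint_confidence(p)
    print(f"P = {p}: joint confidence = {joint:.6f}, possible error = {1 - joint:.6f}")
```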

It should be pointed out, of course, that different confidence limits for the 
two populations could be chosen to give the same percentage of possible error, 
but leading to different estimates of the possible population differences; but 
there is seldom need for such elaborate calculations, departing from the 
confidence limits that are ordinarily used. 


Note 17—REQUIRED SAMPLE SIZE—NO POPULATION DIFFERENCE

In Section B, Example 36, the data were:

            Died   Survived   Total

Heated       11        4        15
Cooled        4        6        10

Total        15       10        25


As a rough estimate it was shown that, even if heating and cooling did not 
differ in their influence on survival, probably about 1600 animals would be 
needed to show that the difference in survival rate in the two groups was 
unlikely to be more than 10%. 


The numbers actually required would vary greatly, depending on what was 
found as the samples grew, and our original samples give little information on 
which to build. Since we are supposing that both samples really belong to 
the same population we can reasonably expect that the enlarged samples will 
not show a significant difference, i.e., that chi square will not exceed 3.841; 
but there is a wide range of possible values below that. 

Detailed discussion is undesirable, but two contingency tables will illustrate 
some of the difficulties. They were prepared by increasing each group to 500 
animals and keeping the ratio of deaths to survivals (marginal totals) at 
15 : 10, i.e., 3 : 2, as in the original pair of samples. The difference between
confidence limits for survival percentage was found as in Example 32. 


(1)         Died   Survived   Total

Heated       306      194       500
Cooled       294      206       500

Total        600      400      1000

Chi square (corrected for continuity) = 0.50. Difference between confidence
limits = 11.1%.


(2)         Died   Survived   Total

Heated       315      185       500
Cooled       285      215       500

Total        600      400      1000

Chi square (corrected for continuity) = 3.5. Difference between confidence 
limits = 14.6%. 

In each case larger samples would be necessary to attain the desired 10% 
difference between confidence limits. According to the $\sqrt{N}$ relation (Example
36) the first table indicates the need for approximately 1200 animals, the 
second table indicates about 2100 animals. 


All such estimates, therefore, indicate little more than orders of magnitude, 
and the easily computed value, 1600, is as useful as any other, in order to 
show an investigator whether his project is worth the effort or is possible at 
all in the given circumstances. 
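
The rough figures of 1200 and 2100 animals can be reproduced with a short Python sketch. It assumes that the relation referred to above means that the difference between confidence limits shrinks in proportion to the reciprocal of the square root of N; the function name is illustrative.

```python
def animals_needed(current_n: int, current_width: float, desired_width: float) -> float:
    """Scale a sample size, assuming the width between limits varies as 1/sqrt(N)."""
    if current_width <= 0 or desired_width <= 0:
        raise ValueError("widths must be positive")
    return current_n * (current_width / desired_width) ** 2


# Both enlarged tables contain 1000 animals; the desired width is 10%
print(round(animals_needed(1000, 11.1, 10.0)))  # table (1), about 1200
print(round(animals_needed(1000, 14.6, 10.0)))  # table (2), about 2100
```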


Note 18—CALCULATION OF P FROM THE BINOMIAL EXPANSION 


Although Tables IA, IB, and II give confidence limits that are as accurate 
as are needed in most instances, greater accuracy may sometimes be required, 
and this can be obtained to any desired degree by use of the binomial ex- 
pansion. Starting with the value given in the table, we test its accuracy 
by calculating P. If it is too high or too low, we shift the confidence limit 
slightly in the appropriate direction and again calculate P. Several repetitions 
of this process will each time increase the precision of the confidence limit. 

Table IB shows that if the number of A’s in the sample is 19, and N is 43, 
the upper limit (P = 0.025) is 60.1%. The estimate, expressed to two 
decimal places, was 60.06%, i.e., the percentage not-A was 39.94%. 

Express the percentages as decimal fractions, i.e., as probabilities: p = the
probability of A = 0.6006; q = the probability of not-A = 0.3994. Let
a = the number of A’s in the sample, i.e., 19.

We require, first, to find the probability of the observed sample, and then
the probabilities of rarer samples in the same tail. The appropriate binomial
formula is $(p + q)^N$, where p = the probability of A, q = the probability of
not-A, and N = the number of individuals in each sample. Then the
probability of a sample that contains a individuals of Class A, is

$$\frac{N!}{a!\,(N-a)!}\; p^{a}\, q^{N-a}$$


Where a number of terms are required, however, it is preferable to use a
modification of the formulae proposed by Gray (14).

Let K = log N! + N log q; k = log p − log q.

The logarithm of the probability of the observed sample is then:

K + ak − log (N − a)! − log a!

For successive (rarer) samples, if we are dealing with an upper limit we
substitute for a the values (a − 1), (a − 2), . . . . If we are
dealing with a lower limit we substitute for a the values (a + 1), (a + 2),
. . . . These rules are obvious if we recall that for an upper
limit the sample is in the left-hand (lower) tail of the distribution, i.e., P is the
probability of occurrence of samples containing a and fewer A’s, whereas for
a lower limit P is the probability of a and more A’s.

In the present example we proceed thus: 

p = 0.6006; log p = $\bar{1}.7785853$;

q = 0.3994; log q = $\bar{1}.6014081$.

(The bar marks a negative characteristic: $\bar{1}.7785853$ = −1 + 0.7785853 = −0.2214147.)

Seven-figure logarithms (Chambers, 23) are desirable for p and q because
K and k entail the multiplying of logarithms. For the factorials, four-figure
logarithms, as in Table VIII, have been found adequate.

K = log 43! + 43 log 0.3994 = 52.7811 − 17.1395 = 35.6416.

k = $\bar{1}.7785853$ − $\bar{1}.6014081$ = −0.2214147 + 0.3985919 = +0.1771772.

The first term is:

K + 19k − log 24! − log 19! = 35.6416 + 3.3664 − 23.7927 − 17.0851 = $\bar{2}.1302$.

The subsequent terms are:

K + 18k − log 25! − log 18!,
K + 17k − log 26! − log 17!, . . . down to the 10th term:
K + 10k − log 33! − log 10!

To evaluate the terms, convert to antilogarithms. Thus, the first term
(the probability of the observed sample) = antilog $\bar{2}.1302$ = 0.01350.

The sum of the 10 probabilities is therefore: 


0.01350 + 0.00682 + 0.00314 + 0.00131 + 0.00050 + 0.00017 + 0.00005 
+ 0.00001 + 0.00000 + 0.00000 = 0.02550, which is the required P. 


This is higher than the desired 0.025; therefore we increase the limit (i.e.,
push the distribution to the right) up to 60.16%, and proceed as before.
(Note that K and k are changed, but the factorials are used over again.)
P is now 0.02470. This is too low and linear interpolation between 60.06%
(P = 0.02550) and 60.16% (P = 0.02470) gives 60.12%. This could now be
tested as before, and, if desired, a closer approximation, to three or more
decimal places, could similarly be found.
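
The trial-and-adjust procedure just described can be sketched in Python. The sketch sums the whole lower tail directly instead of using logarithms, and it replaces linear interpolation by repeated halving of the interval; both are substitutions for convenience, and the function names are illustrative assumptions.

```python
import math


def lower_tail(a: int, n: int, p: float) -> float:
    """P of a sample of n containing a or fewer A's when the population proportion is p."""
    q = 1.0 - p
    return sum(math.comb(n, x) * p ** x * q ** (n - x) for x in range(a + 1))


def upper_limit(a: int, n: int, tail_p: float = 0.025, tol: float = 1e-7) -> float:
    """Upper confidence limit (as a proportion), found by repeated halving.

    Requires a < n. The lower tail falls as p rises, so the limit is raised
    when P is too large and lowered when P is too small.
    """
    low, high = a / n, 1.0
    while high - low > tol:
        mid = (low + high) / 2.0
        if lower_tail(a, n, mid) > tail_p:
            low = mid   # P too large: make the limit higher
        else:
            high = mid  # P too small: make the limit lower
    return (low + high) / 2.0


# The worked example: 19 A's in a sample of 43
print(f"P at 60.06%: {lower_tail(19, 43, 0.6006):.5f}")
print(f"Upper limit: {100 * upper_limit(19, 43):.2f}%")
```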
Useful rules for indicating the direction in which the limits require to be
moved are:

Lower limit: If P is too large make the limit lower;
             If P is too small make the limit higher.

Upper limit: If P is too large make the limit higher;
             If P is too small make the limit lower.


Often 8 or 10 terms of the expansion are sufficient because by that time the 
probabilities contain five or more zeros. 


When there are zero A’s in a sample of N, as in Table IA, it is easy to find
the upper confidence limit corresponding to any desired value of P because
the first formula in this Note becomes $q^N = P$; therefore N log q = log P.
Solve for q, subtract it from 1, and multiply by 100 to give the required
upper limit.
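
For the case of zero A’s the solution is a single line. In the Python sketch below the sample size of 10 is a hypothetical illustration, not a value taken from Table IA, and the function name is likewise an assumption.

```python
def upper_limit_zero(n: int, tail_p: float = 0.025) -> float:
    """Upper confidence limit (%) when a sample of n contains no A's: solve q**n = P."""
    if n < 1 or not 0.0 < tail_p < 1.0:
        raise ValueError("need n >= 1 and 0 < P < 1")
    q = tail_p ** (1.0 / n)
    return 100.0 * (1.0 - q)


# Hypothetical sample of 10 with no A's, P = 0.025
print(f"{upper_limit_zero(10):.2f}%")
```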


Note 19—THE USE OF ESTIMATES AS TRUE POPULATION VALUES


This subject is discussed in Section B, Examples 1, 3, and 26. The data of 
Example 26 can be used to illustrate the effect. In a sample of 56 mentally 
defective children 14 (25%) had mothers who were Rh negative. It appeared 
from another author’s survey that the general population contained 15% Rh 
negative persons, and, if this were correct, P for the observed sample of 56 
would be 0.0343 (Note 8, Example 2)—the probability of chance occurrence 
of samples of 56 containing 14 or more Rh negative individuals. 


It was found, however, that the ‘population’ percentage (approximately 15) 
was really the percentage in a sample of 334. Taking the number of Rh 
negatives as 50, we can compare this sample with the sample of 56 thus: 


                      Rh negative   Rh positive   Total

General population         50           284         334
Mentally defective         14            42          56

Total                      64           326         390


Similarly, we can compare results if the sample from the general population
had been 100 or 1000, with 15% Rh negative in each sample. P for chi
square is determined by the table of Yule and Kendall (30), and ½P is taken,
for comparison with P from the binomial, given above (0.0343).

Sample    Chi square     ½P

  100        1.76       0.092
  334        2.82       0.047
 1000        3.32       0.034

With a sample of 1000 or more, there is in this case no appreciable difference
from the P value obtained by treating the sample as if it were the actual
population, but, when the proportions in the two classes are more unequal,
still larger samples are needed. The safe rule is to find the sample from
which the so-called population frequency has been estimated and compare
by a contingency table.
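
The comparison of the three general-population samples can be reproduced with the following Python sketch. It assumes the continuity-corrected fourfold formula and obtains ½P from the normal curve instead of from the table of Yule and Kendall, so the last figure may differ slightly from the tabulated one; the function names are illustrative.

```python
import math


def chi_square_corrected(a: int, b: int, c: int, d: int) -> float:
    """Chi square, corrected for continuity, for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    shrunk = max(abs(a * d - b * c) - n / 2.0, 0.0)
    return n * shrunk ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def half_p(chi_square: float) -> float:
    """Half of the chi square probability for one degree of freedom."""
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))


# Mentally defective sample: 14 Rh negative, 42 Rh positive
for size in (100, 334, 1000):
    negatives = round(0.15 * size)  # 15, 50, and 150 Rh negative
    chi = chi_square_corrected(negatives, size - negatives, 14, 42)
    print(f"{size:5d}  chi square = {chi:.2f}  half P = {half_p(chi):.3f}")
```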


Note 20—TESTS OF ACCURACY OF CHI SQUARE IN
FOURFOLD CONTINGENCY TABLES


Various studies of the reliability of chi square have been made, such as 
those of Yates (29), Cochran (5), and Haldane (15), but they are hardly in a 
form suitable for direct use by most medical investigators. Table VIII of 
Fisher and Yates (11) provides a method of compensating for the errors of chi 
square in a wide range of samples; but for medical research workers it appears 
desirable to provide a somewhat simpler and more direct method, especially 
for samples not covered by the table of Fisher and Yates. Therefore our 
Tables IV, V, and VI were prepared, and, for samples not covered by those 
tables, rules for the use of chi square are presented in Example 20. These 
were derived partly from a rather considerable experience in comparison of 
chi square with the exact (factorial) test in the preparation of Tables IV, V, 
and VI, but largely from a series of specially designed tests. 


In the special tests N₁ and N₂ were the total numbers of individuals in the
respective samples. N₁ was given the values 20, 30, 40, 50, 60, 70, 80, 90,
100, 200, and 500. For each value of N₁, N₂ was given values that were, as
nearly as possible, 5, 10, 20, 40, 70, and 100% of N₁. With each value of
N₂, three arrangements of A’s and not-A’s in N₁ were used: (1) A = 0;
(2) A = N₁/4; (3) A = N₁/2. For arrangement (2) there were, of course,
two cases to be tested, according to whether the proportion of
A’s in N₂ was greater or less than in N₁.

Under the conditions so specified, the frequencies in N₂ were altered, step
by step, to produce a minimum significant difference between the samples (P
for one tail less than 0.025), and then one step farther to produce the maximum
nonsignificant difference. (With some of the smaller samples, of course, only
the latter was available.) For each contingency table P was found by the
exact (factorial) method, and chi square (corrected for continuity) was
calculated. The two results were compared by allocating the chi square value
to its appropriate ½P interval in Table VII.

It is upon the 500 contingency tables so investigated that the rules in
Example 20 are chiefly based, and it will be seen that they justify the use of
chi square over a very wide range of samples, provided that no greater
precision is claimed than is provided by the ½P intervals. A more precise
estimate was attempted by interpolation in the chi square table of Yule and
Kendall (30), for comparison with the exact P values. Even with the largest
samples it was found that, where the distributions were markedly skew, it
would be unsafe to claim greater precision than is afforded by the ½P intervals.
Note 21—PREPARATION AND ACCURACY OF THE TABLES


Table IA was derived from the formula $q^N = P$ (see Note 18).


Tables IB and II, Graphs 1 to 6 (= Figs. 6 to 11) were prepared from 
Table VIII of Fisher and Yates (11). A preliminary computation of the 
required values with the aid of a slide rule was followed by machine computa- 
tion, and discrepancies between the two sets of values were investigated. 
Graphs of all parts of the tables were drawn and some slight irregularities were
found. These were investigated by recomputation and where this gave the
same value again an estimate of its error was made by use of the binomial
expansion.


A few tests of accuracy were made by application of Scheffé’s (24) method 
to Thompson’s (26) tables, and by the use of Pearson’s (21) Tables of the 
Incomplete Beta-function in evaluating the terms of the binomial expansion; 
but it was soon found desirable to evaluate the terms directly by the method 
shown in Note 18. This was done not only for the values suspected of error 
but for other values selected from various sections of the tables, making a 
total of more than a hundred values so tested, in 93 of which the true values 
were approached by a series of steps as in Note 18. 


The values, which had been originally computed with six or more figures,
were then rounded off for the tables, and from the tests it would appear that
the error in the tabulated values is seldom as great as ± 1 in the last
decimal place.


The intervals between N values in Tables IA, IB, and II are somewhat 
irregular because they were chosen as the work progressed, in order to avoid 
too great gaps between successive values of the confidence limits without 
unduly increasing the amount of computation. 


Table III was prepared by linear interpolation between the last two entries 
of Table VIII1 of Fisher and Yates (11). 


Table IV was computed by the exact method from logarithms of factorials 
with seven decimal figures. The probabilities were found from five-figure 
antilogarithms and rounded off to four decimal places for the table. Summa- 
tion of the probabilities to unity gave an automatic check. 


Note that P for a contingency table 0 : N/0 : N is not shown because it is
unity. P values for A :(N — A)/A:(N — A), where A is not zero (e.g., 
2 : 13/2 : 13), are given, and represent the probability of the observed sample 
plus the probabilities of samples in either tail, the distribution being 
symmetrical, with the observed samples at the mode. 


Table V was computed by the exact method from logarithms of factorials 
with four decimal figures, the probabilities being checked by summation. 
Where the summation threw doubt on the precision, seven-figure logarithms 
were used. The probabilities in the table can be accepted as seldom in error
by more than ± 1 in the fourth place.
Table VI: many entries were derived from exact probabilities, computed
by factorials (four-figure logarithms); the others were found by chi square,
corrected for continuity and tested for significance by Table VIII of Fisher
and Yates.


Note 22—FACTORIALS


To recall the meaning of “factorial” note that factorial 4, expressed as
|4 or 4!, is 4 × 3 × 2 × 1. The factorial of 1 is 1, and so is the factorial
of zero.


The direct use of factorials entails heavy multiplication and division; 
therefore logarithms of factorials are used. Table VIII contains logarithms 
of factorials of numbers up to 1000, the logarithms being given to four decimal 
places, which are sufficient for most purposes. Seven-figure logarithms are 
given in the tables of Fisher and Yates (11, Table XXX, numbers up to 300)
and of Pearson (20, Table XLIX, numbers up to 1000).
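
Where a table of logarithms of factorials is not at hand, the values can be computed. The Python sketch below uses the log-gamma function, which is a substitute for the tables and not the method by which Table VIII was prepared; the function name is illustrative.

```python
import math


def log10_factorial(n: int) -> float:
    """Common logarithm of n!, using ln n! = lgamma(n + 1)."""
    if n < 0:
        raise ValueError("factorial is undefined for negative numbers")
    return math.lgamma(n + 1) / math.log(10)


# Values used in Note 18, with 0! and 4! for comparison
for n in (0, 4, 19, 24, 43):
    print(n, round(log10_factorial(n), 4))
```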


Note 23—RANDOM SAMPLING TECHNIQUES


The disk-sampling mentioned in Section A2, and described more fully in 
Note 1, reveals two essential features of random sampling techniques: 


(1). Since the population (persons, animals, or other individuals) cannot 
itself be shuffled and mixed in thorough random fashion, it is represented by 
an artificial population that can be randomized, e.g., heads and tails on coins, 
spots on dice, numbers on cards or disks, or the items in a table of random 
numbers, such as that of Fisher and Yates (11)—the most dependable method 
because the numbers have been thoroughly tested for bias. 


(2). The technique must insure that each individual has an equal chance 
of being taken into the sample. 


Two further rules should be noted: 
(1). As far as possible plan the whole experiment, whether a clinical or a
laboratory experiment, at the outset, including the random allocation of
treatments.


(2). Having used a proper technique do not reject a sampling result because 
it seems nonrandom. As is well known, such extreme results occasionally 
occur by pure chance. 


Tables of Random Numbers 


The table of Fisher and Yates (six pages) contains two-figure numbers
arranged in blocks to facilitate reading, for example:


26 72 39 27 67
43 00 65 98 50
16 06 10 89 20
09 65 90 77 47
65 39 07 16 29
To use the table, the observer selects anywhere in it a row, column, or
diagonal, without previously inspecting the numbers themselves, and then
takes the numbers in that row, column, or diagonal as they occur, passing
from block to block without interruption. If only 0 to 9 are needed, only
single figures are taken, e.g., the right-hand digits in the first column of the
block shown above: 6, 3, 6, 9, 5. If three (or more) digits are needed, one
uses three (or more) adjacent columns or rows, e.g., from the first (left-hand)
column and the left half of the second column in the above block: 267, 430,
160, 096 (i.e., 96), 653.


Study of a few examples will enable the reader to devise methods suitable 
to his particular problem. 


Example (1) 
Patients are expected to arrive, one at a time, over a period of weeks or
months. Two treatments, A and B, are to be tested on equal numbers, 20
patients each.


In coin tossing let heads represent Treatment A, tails Treatment B, and 
let the successive tosses represent patients in order of arrival. After the first 
20 heads (or tails) all Treatments A (or B) will have been allocated, and the 
remaining patient or patients will, of course, fall into the other class—B (or A). 


In dice throwing let odd numbers represent Treatment A, even numbers 
Treatment B; and likewise in card dealing or disk sampling. 


In a table of random numbers use one row or column, and record the odd or 
even digits as they occur, counting zero as even, to allow equal chances for 
odd and even. Thus the right-hand column in the block shown above gives 
7 (odd), 0 (even), 0 (even), 7 (odd), 9 (odd). 

If the individuals are already present, e.g., 40 hospital patients or animals, 
number them, 1 to 40, in any convenient order, e.g., by hospital beds or 
animal cages, and allocate treatment as above. 


If the individuals are to be divided according to sex, age (adults and 
children), or other features, make the random allocation separately in each 
group. 
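
The allocation of Example (1) can be sketched in Python. Here the standard pseudo-random generator stands in for a row or column of the table of random numbers, which is an assumption of the sketch and not the procedure recommended above; the function name is illustrative.

```python
import random


def allocate_two_treatments(per_group: int = 20, seed=None) -> list:
    """Allocate 2 * per_group patients, in order of arrival, to Treatments A and B.

    Odd digits stand for A and even digits (zero included) for B. Once one
    treatment has its full quota, the remaining patients receive the other.
    """
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < 2 * per_group:
        if allocation.count("A") == per_group:
            allocation.append("B")
        elif allocation.count("B") == per_group:
            allocation.append("A")
        else:
            digit = rng.randrange(10)  # stand-in for the next digit of the table
            allocation.append("A" if digit % 2 == 1 else "B")
    return allocation


plan = allocate_two_treatments(20, seed=1)
print("".join(plan))
```

When the individuals are divided by sex, age, or other features, the same function would be called separately for each group.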

When treatments, A and B, are to be compared it is not uncommon to apply them to 
alternate patients in order of arrival, or to patients in alternate beds, or animals in alternate 
cages: A to the first, third, fifth, and so on, B to the second, fourth, sixth, and so on. Several
possibilities of bias in such an arrangement may be considered, e.g., many diseases have 
rhythms of severity and if, unknown to the investigator, the severity is waning and he 
alternates the treatment in order of the patients’ arrival, the second member of each pair, 
receiving Treatment B, will tend on the average to have a milder attack than the first, who
received Treatment A. Alternate animal cages likewise may differ in exposure to air currents, 
temperature, or light. 


These various possibilities may be of little or no consequence, or the investigator may try 
in various ways to compensate for factors that he thinks may introduce bias, but the only way 
to guard against hidden (unsuspected) bias, and thereby avoid present and future doubt, is
strictly random sampling. When an investigator employs a nonrandom method as if equivalent 
to a random technique, the onus is on him to prove it justifiable, and this would entail an 
additional, usually large, investigation. 
If more than two treatments are to be tested, coin tossing is rather
complicated. Dice throwing can be used, e.g., for three treatments, let 
Treatment A be represented by one spot and six spots, B by two and five, C by 
three and four. In using a set of randomly distributed items, such as numbers 
from cards, disks, or a table, each treatment must be represented by an equal 
number of items, the others being disregarded when they turn up; e.g., in 
allocating six treatments by sampling from 100 numbered cards or random 
numbers, disregard Nos. 97, 98, 99, 100 (00 in a table of random numbers) 
because each treatment will then be represented by 16 numbers. 

Example (2)

Treatments A, B, C, D, E, and F are each to be tested on five animals.
The 30 animals are numbered in any convenient order, and let us suppose
that the random numbers in a row or column (or from cards or disks) are, in
order: 03, 92, 18, 27, 00, 46, 57, 99, 16, 96, 56, 30, 33, 72, 85, 22, 84, 64, ...
A simple method, which is not unduly wasteful of numbers, is to note what
remainder would be left if each number were divided by 6 (the number of
treatments). Let the remainders represent treatments: 1 = A, 2 = B,
3 = C, 4 = D, 5 = E, 0 = F. By this method the above random numbers
would allot treatments as follows:


Random number   Remainder   Treatment   Serial No. of animal

     03             3           C                1
     92             2           B                2
     18             0           F                3
     27             3           C                4
     00           Omit
     46             4           D                5
     57             3           C                6
     99           Omit
     16             4           D                7
     96             0           F                8


and so on. As soon as any treatment-group has received five animals,
remainders indicating that treatment are, of course, disregarded.

Example (3)

Five animals from a numbered set of 40 are to receive Treatment A, the
remaining 35 are to be used as untreated controls, or to receive Treatment B.
In a row or column of two-digit random numbers neglect 00 and all entries
above 40. If, for example, the row contains 26, 72, 39, 27, 67, 90, 29, 16,
animals bearing the serial numbers 26, 39, 27, 29, and 16 would receive
Treatment A because they have been met first in the search.
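
The procedures of Examples (2) and (3) can be sketched in Python. The function names are illustrative assumptions, and the rule of skipping an animal already chosen in Example (3) is an added safeguard not stated in that example.

```python
def allocate_by_remainder(numbers, treatments="ABCDEF", per_group=5):
    """Example (2): allot treatments from two-digit random numbers by remainder.

    With six treatments, numbers 01 to 96 are used; 00 and 97 to 99 are omitted
    so that every treatment is represented by 16 numbers. Remainder 1 = A, ...,
    5 = E, 0 = F. Returns (random number, remainder, treatment, serial number).
    """
    k = len(treatments)
    usable = (100 // k) * k  # 96 when there are six treatments
    counts = {t: 0 for t in treatments}
    rows = []
    for number in numbers:
        if number == 0 or number > usable:
            continue  # omitted numbers
        remainder = number % k
        treatment = treatments[remainder - 1]  # remainder 0 picks the last letter
        if counts[treatment] == per_group:
            continue  # that treatment-group is already full
        counts[treatment] += 1
        rows.append((number, remainder, treatment, len(rows) + 1))
        if len(rows) == k * per_group:
            break
    return rows


def select_animals(numbers, population=40, sample=5):
    """Example (3): take the first `sample` distinct numbers from 1 to `population`."""
    chosen = []
    for number in numbers:
        if 1 <= number <= population and number not in chosen:
            chosen.append(number)
            if len(chosen) == sample:
                break
    return chosen


for entry in allocate_by_remainder([3, 92, 18, 27, 0, 46, 57, 99, 16, 96]):
    print(entry)
print(select_animals([26, 72, 39, 27, 67, 90, 29, 16]))  # [26, 39, 27, 29, 16]
```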
Example (4)

From each of 15 litters three animals are used, to compare Treatments A,
B, and C. In single-digit random numbers neglect zeros and allot the treatments
according to remainders after dividing by 3, as in Example (2), thus:


Sample No. (1) (1) (2) 2) — 3) — — — — ) G) 


In Sample (1) animal No. 1 receives C, animal No. 2 receives B. Therefore 
the remaining animal, No. 3, must receive A. In Sample (2) animal No. 1 
receives B (the third letter), animal No. 2 receives A; therefore animal No. 3
receives C. In Sample (3) the treatments are C, A, and therefore B. In 
Sample (4) animal No. 1 receives C, and we must omit the three succeeding 
C’s, allocating B to animal No. 2, and so on. 

If the samples are of twins, pairs of animals or parallel pairs of bacterial 
culture plates, the procedure is similar but simpler, for even numbers can 
represent A and odd numbers B, zero being counted as even. 

Example (5) 

Let us suppose that there are 234 animals in stock and that we require a 
random sample of 40. Running down a random number column we could 
take the first 40 numbers between 001 and 234, but this would waste numbers, 
and we can assign four random numbers to each animal, using numbers up to 
4 X 234 (i.e., 936). Thus, animal No. 1 would be represented by 001, 002, 
003, 004; animal No. 2 would receive the next four, and animal No. 234 would 
receive 933, 934, 935, 936. We should then take the first 40 numbers, in a 
column or row, that fell within this range, disregarding numbers outside the 
range, among which would be included 000 (equivalent to 1000). When an 
animal had already been selected, if one of its numbers appeared again it 
would be disregarded. 
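
The device of assigning four random numbers to each animal can be sketched in Python. The function names are illustrative assumptions; the three-digit random numbers would be read from a table in practice.

```python
def animal_for(number: int, animals: int = 234, per_animal: int = 4):
    """Serial number of the animal represented by a three-digit random number.

    Numbers 001 to animals * per_animal are used; 000 and anything above
    that range are disregarded (None is returned).
    """
    if 1 <= number <= animals * per_animal:
        return (number - 1) // per_animal + 1
    return None


def draw_sample(numbers, size=40, animals=234, per_animal=4):
    """Take animals in the order met, disregarding any already selected."""
    chosen = []
    for number in numbers:
        animal = animal_for(number, animals, per_animal)
        if animal is not None and animal not in chosen:
            chosen.append(animal)
            if len(chosen) == size:
                break
    return chosen


print(animal_for(1), animal_for(4), animal_for(5), animal_for(936), animal_for(937))
```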


Note 24—RECOMMENDATIONS REGARDING MATHEMATICAL TABLES 
AND OTHER SOURCES OF INFORMATION 


Tables. For nearly everyone who applies statistical tests to medical or 
biological data the tables of Fisher and Yates (11) should be considered indis- 
pensable. 


Among four-figure logarithm tables such as are supplied to colleges, time is 
saved by those that contain antilogarithms, as do those of Bottomley (2). 
Where four figures are not accurate enough, five-figure tables with anti- 
logarithms (Castle, 3) will often be sufficient and are much easier to work 
with than seven-figure tables. Sometimes, however, the latter are needed 
(Chambers, 23). 

Many small books of tables contain squares, square roots, and reciprocals, 
but an investigator frequently finds the need for greater accuracy than they 
provide, and Barlow’s tables (6) save much time and effort. 
Graphs. Those who are using chi square for contingency tables and
other purposes will find useful the graphs prepared by Bliss (1) and by Yule and
Kendall (30).

Information on Methods. Although this Note is not a study guide, a little 
help may be desirable. Fisher’s Statistical Methods for Research Workers (9) 
and The Design of Experiments (10) are, of course, basic works for investigators 
in any biological field. As regards qualitative statistics in particular, Statis- 
tical Methods shows the application of chi square to more complicated problems 
than have been discussed in this article, and it discusses the Poisson distribu- 
tion. The Design of Experiments introduces the reader to the use of random 
sampling in biological experiments. 

The study of literature on confidence limits could profitably begin with the 
article by Clopper and Pearson (4) and the one by Stevens (25) in which 
appeared a preliminary version of Table VIII1 of Fisher and Yates. 
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Index of Subjects 
References are to pages and Examples (Ex.) 
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Ex. 1-13 
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Bimodal distributions, 64 
Binomial expansion, 67, 92 
Biological assay, 14 

Blood substitute, 21, 58 


Cause, fallacies, 16, 41, 42, 81 
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Chances, 7 

Chi square, in argument from sample to 
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general nature, 26, 37 
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limitations of, 78 
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probabilities, 27, 39, 78, 80, 88 
proportional to sample size, 60 
summation of, 55 
table, interpolation, 55 
tests of accuracy, 95 

Clinical experiments, difficulties, 34 

Color blindness, 18, 76 

Comparisons, one-sided, 11, 21, 31, 72 

Confidence belt, 74 

Confidence limits of population values, 9, 73 
of sample differences, 56, 90 

Contingency tables, 10 

Contingency tests, chi square, see Chi square 
exact method, 82 
general procedure, 81 

Corneal ulcer, prontosil treatment, 47 


Data, combination of, 52 
Degrees of freedom, 27, 39, 87 
Diphtheria antitoxin, 41 


Estimates used as population values, 15, 44, 94 


Factorials, 97 
in binomial expansion, 93 
in contingency test, 82 
Frequency, 6 
distribution, 70 


Gastric acidity, 65 


Health departments, fallacies in reports, 16 

Hemophilus influenzae, meningitis, 33 
penicillin, 32 

Homogeneity, 10, Ex. 29-31 

Hookworm disease, 19 

Hospital records, fallacies, 16, 49 


Individuals, double observations, 20 


Majority, significant, 19, 20, 24 

Measurements, desirability, 17, 20, 63 
treated as enumeration data, 61 
unreliability, 5 

Meningitis, Hemophilus influenzae, 33 
sulphonamides and penicillin, 33, 86 

Mode, 70 

Motion sickness, 13, 45, 62 


Nonsignificance, meaning, 8, 34, 44, 56, 71 
Normal frequency curve, 74 
Numbers required in samples, 22, 30, 57, 91 


Odds, 7, 67 
One-sided comparisons, 11, 21, 31, 72 
Orbit, nerves, 17 


P, binomial, 70, 71, 92 
chi square, 27, 39, 78, 80, 88 
exact contingency tests, 84 
general use, 7 
Pairing, dangers, 21, 43 
tests after, 42 
Percentages, dangers, 7, 17, 18, 19, 30, 46 
Plague, meningitis, 19 
Poliomyelitis virus, in feces, 40 
in monkeys, 24 
Population, 5 
assumptions, 15, 19, 41, 44, 


infinite, 67 
Probabilities, 7, 67 (see also P) 
combination, 54 


Random sampling, 6, 11, 16, 34, 45, 65, 97 
Rh factor, mental defect, 43, 44, 77, 85, 89, 94 


Samples, combination, 51 
comparison, Ex. 14-29 
dangerous use, 15, 19, 41, 44, 94 
differences, confidence limits, 56, 90 
fallacious comparison of extreme, 47 
random, see Random sampling 
size, 11, 22, 30, 57-61, 91 
Scabies and DDT, 37, 57, 59 
Sex ratios of stillborn children, 50 
Shock and temperature, 33, 60, 91 
Significance, 7, 11, 71, 72 
investigator’s choice of standards, 
Sizes of sample required, 11, 22, 30, 57-61, 91 
Soporific drugs, 63 
Standard deviation (standard error), 25, 45, 


Sternum, synostosis, 41 


Tables, preparation and accuracy, 96 
recommendations, 100 
Toxicity tests, 14 
Tuberculosis, in navy, 19 
synostosis of sternum, 41 


Visual acuity, 47, 61 
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TABLE IA 


CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA— 
NUMBER OF A’s IN SAMPLE = 0 


          Upper limits (%)                 Upper limits (%)
 N\P    .10   .025   .005       N\P    .10   .025   .005
   1   90.0   97.5   99.5        35    6.4   10.0   14.0
   2   68.4   84.2   92.9        40    5.6    8.8   12.4
   3   53.6   70.8   82.9        45    5.0    7.9   11.1
   4   43.8   60.2   73.4        50    4.5    7.1   10.0
   5   36.9   52.2   65.3        55    4.1    6.5    9.2
   6   31.9   45.9   58.6        60    3.8    6.0    8.5
   7   28.0   41.0   53.1        65    3.5    5.5    7.8
   8   25.0   36.9   48.4        70    3.2    5.1    7.3
   9   22.6   33.6   44.5        75    3.0    4.8    6.8
  10   20.6   30.9   41.1        80    2.8    4.5    6.4
  11   18.9   28.5   38.2        85    2.7    4.2    6.0
  12   17.5   26.5   35.7        90    2.5    4.0    5.7
  13   16.2   24.7   33.5        95    2.4    3.8    5.4
  14   15.2   23.2   31.5       100    2.3    3.6    5.2
  15   14.2   21.8   29.8       110    2.1    3.3    4.7
  16   13.4   20.6   28.2       120    1.9    3.0    4.3
  17   12.7   19.5   26.8       130    1.8    2.8    4.0
  18   12.0   18.5   25.5       140    1.6    2.6    3.7
  19   11.4   17.6   24.3       150    1.5    2.4    3.5
  20   10.9   16.8   23.3       160    1.4    2.3    3.3
  21   10.4   16.1   22.3       170    ...    ...    3.1
  22    9.9   15.4   21.4       180    1.3    2.0    2.9
  23    9.5   14.8   20.6       190    1.2    1.9    2.7
  24    9.2   14.3   19.8       200    ...    1.8    2.6
  25    8.8   13.7   19.1       220    1.0    1.7    2.4
  26    8.5   13.2   18.4       250    .92    1.5    2.1
  27    8.2   12.8   17.8       300    .76    1.2    1.8
  28    7.9   12.3   17.2       400    .57    .92    1.3
  29    7.6   11.9   16.7       500    .46    .74    1.1
  30    7.4   11.6   16.2       700    .33    .53    .76
                               1000    .23    .37    .53


Note.—For mode of use see Example 7. N = total number of individuals in the sample. 
P = probability. For N > 1000, divide values for N = 1000 by N/1000. To find limits 
corresponding to other values of P, see the last paragraph of Note 18 in Section C. 
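
When no A's are observed, an upper limit of this kind has a simple closed form, which may help in checking entries or extending the table. The sketch below assumes, as the tabulated values suggest, that the upper limit is the population percentage at which the chance of drawing no A's in N individuals equals P, that is, 100(1 - P^(1/N)). The function names are illustrative, and the second function merely restates the rule given in the Note for N > 1000.

```python
def upper_limit_zero(n, p):
    """Upper limit (%) when a sample of n individuals contains no A's.

    Solves (1 - q)**n = p for the population proportion q.
    """
    return 100.0 * (1.0 - p ** (1.0 / n))


def upper_limit_large_n(n, value_at_1000):
    """Rule from the Note for N > 1000: divide the N = 1000 entry by N/1000."""
    return value_at_1000 / (n / 1000.0)


if __name__ == "__main__":
    for n in (2, 5, 100, 1000):
        row = [round(upper_limit_zero(n, p), 2) for p in (0.10, 0.025, 0.005)]
        print(n, row)
    print(2000, upper_limit_large_n(2000, 0.23))
```

Values computed in this way can differ from the printed entries by one unit in the last figure when they fall near a rounding boundary.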
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TABLE IB 


CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA— 
NUMBER OF A’s IN SAMPLE: 1 TO 20 


Number of A’s in sample = 1 


[Table body largely illegible in this copy. The one column that survives, apparently the lower limits (%) for P = .025, is given here without its N values.] 

.84 .63 .51 .42 .36 .32 .28 .25 .23 .21 .19 .18 .17 .16 .15 .14 .13 .13 .12 .12 .11 .11 .10 
.097 .094 .090 .084 .079 .072 .063 .056 .051 .042 .036 .032 .028 .025 .023 .021 .017 .016 .013 
.0084 .0051 .0025 

Note.—For mode of use see Examples 1 to7. N = total number of individuals in sample. 
P = probability. For N > 1000, divide values for N = 1000 by N/1000. 
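
For readers who wish to check an entry of Table IB or obtain limits for a sample not tabulated, the following sketch computes limits directly from the binomial distribution. It assumes that the tabulated values are exact binomial limits of the Clopper and Pearson type, with P the probability in each tail; the printed values appear consistent with that reading. The function names are illustrative, and the bisection search is a convenience of this sketch, not the method by which the tables were prepared.

```python
from math import comb


def prob_at_most(x, n, p):
    """P(X <= x) for X distributed as Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(x + 1))


def _solve(f, target, increasing):
    """Bisection for f(p) = target on 0 <= p <= 1, with f monotonic in p."""
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if (f(mid) < target) == increasing:
            lo = mid  # the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2.0


def lower_limit(x, n, tail):
    """Lower limit (%): the proportion at which P(X >= x) equals tail."""
    if x == 0:
        return 0.0
    return 100.0 * _solve(lambda p: 1.0 - prob_at_most(x - 1, n, p), tail, True)


def upper_limit(x, n, tail):
    """Upper limit (%): the proportion at which P(X <= x) equals tail."""
    if x == n:
        return 100.0
    return 100.0 * _solve(lambda p: prob_at_most(x, n, p), tail, False)


if __name__ == "__main__":
    # 2 A's in a sample of 10, P = .025 in each tail
    print(round(lower_limit(2, 10, 0.025), 1), round(upper_limit(2, 10, 0.025), 1))
```

Because the binomial coefficients are converted to floating point, the sketch is suitable for N up to about 1000, the range of the tables.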
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TABLE IB—Continued 

Number of A’s in sample = 2 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
   4    3.0    6.8   14.2   85.8   93.2   97.0
   5    2.3    5.3   11.2   75.4   85.3   91.7
   6    1.9    4.3    9.3   66.7   77.8   85.6
   7    1.6    3.7    7.9   59.6   71.0   79.1
   8    1.4    3.2    6.9   53.9   65.1   74.2
   9    1.2    2.8    6.1   49.1   60.0   69.3
  10    1.1    2.5    5.4   45.0   55.6   64.8
  11    .98    2.3    4.9   41.5   51.8   60.9
  12    .89    2.1    4.5   38.6   48.4   57.4
  13    .82    1.9    4.2   36.0   45.5   54.2
  14    .76    1.8    3.9   33.7   42.8   51.3
  15    .71    1.7    3.6   31.7   40.5   48.7
  16    .67    1.6    3.4   30.0   38.4   46.3
  17    .63    1.5    3.2   28.4   36.5   44.2
  18    .59    1.4    3.0   27.0   34.7   42.2
  19    .56    1.3    2.8   25.7   33.1   40.4
  20    .53    1.2    2.7   24.5   31.7   38.7
  21    .50    1.2    2.6   23.4   30.4   37.2
  22    .48    1.1    2.4   22.4   29.2   35.8
  23    .46    1.1    2.3   21.5   28.1   34.4
  24    .44    1.0    ...    ...    ...    ...
  25    .42    .98    ...    ...    ...    ...
  26    .41    .95    ...    ...    ...    ...
  27    .39    .91    ...    ...    ...    ...
  28    .38    .88    ...    ...    ...    ...
  29    .36    ...    ...    ...    ...    ...
  30    .35    ...    ...    ...    ...    ...
  32    .33    ...    ...    ...    ...    ...
  35    .30    .70    ...    ...    ...    ...
  45    .23    .54    1.2   11.4   15.2   19.0
  50    ...    .49    1.1   10.3   13.7   17.2
  60    .17    .41    .89    8.6   11.5   14.5
  70    .15    .35    .76    7.4   10.0   12.6
  80    .13    .30    .67    6.5    8.7   11.1
  90    .12    .27    .59    5.8    7.8    9.9
 100    .10    .24    .53    5.2    7.0    8.9
 120   .086    .20    .44    4.4    5.9    7.5
 150   .069    .16    .36    3.5    4.7    6.0
 200   .052    .12    .27    2.6    3.6    4.6
 300   .034   .081    .18    1.8    2.4    3.1
 500    ...   .048    .11    1.1    1.4    1.8
1000   .010   .024   .053    .53    .72    .92


TABLE IB—Continued 

Number of A’s in sample = 3 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
   6    6.6   11.8   20.2   79.8   88.2   93.4
   7    5.5    9.9   17.0   72.1   81.6   88.3
   8    4.8    8.5   14.6   65.5   75.5   83.0
   9    4.2    7.5   13.0   59.9   70.1   78.1
  10    3.7    6.7   11.6   55.2   65.2   73.5
  11    3.3    6.0   10.5   51.1   61.0   69.4
  12    3.0    5.5    9.6   47.5   57.2   65.7
  13    2.8    5.0    8.8   44.4   53.8   62.2
  14    2.6    4.7    8.2   41.7   50.8   59.0
  15    2.4    4.3    7.6   39.3   48.1   56.1
  16    2.2    4.0    7.1   37.1   45.6   53.5
  17    2.1    3.8    6.7   35.2   43.4   51.1
  18    2.0    3.6    6.3   33.4   41.4   48.9
  19    1.9    3.4    6.0   31.9   39.6   46.9
  20    1.8    3.2    5.6   30.4   37.9   45.0
  21    1.7    3.1    5.4   29.1   36.4   43.3
  22    1.6    2.9    5.1   27.9   34.9   41.7
  23    1.5    2.8    4.9   26.8   33.6   40.2
  24    1.5    2.7    4.7   25.8   32.4   38.8
  25    ...    2.5    4.5   24.8   31.2   37.5
  26    1.3    2.4    4.3   23.9   30.2   36.2
  27    1.3    2.4    4.2   23.1   29.2   35.1
  28    1.2    2.3    4.0   22.3   28.2   34.0
  29    1.2    2.2    3.9   21.6   27.4   33.0
  30    1.2    2.1    3.7   20.9   26.5   32.0
  32    1.1    2.0    3.5   19.7   25.0   30.3
  35    .99    1.8    3.2   18.1   23.1   28.0
  40    .86    1.6    2.8   16.0   20.4   24.9
  45    ...    1.4    2.5   14.2   18.3   22.4
  50    .69    1.3    2.2   12.9   16.6   20.3
  60    .57    1.0    1.9   10.8   13.9   17.2
  70    .49    .89    1.6    9.3   12.0   14.9
  80    ...    ...    1.4    8.2   10.6   13.1
  90    ...    ...    1.2    7.3    9.4   11.7
 100    .34    .62    1.1    6.6    8.5   10.6
 120    .28    .52    .92    5.5    7.1    8.9
 150    ...    ...    ...    4.4    5.7    7.1
 200    .17    .31    .55    3.3    4.3    5.4
 300    ...    .21    ...    2.2    2.9    3.6
 500    ...    .12    ...    1.3    1.7    2.2
1000    ...   .062    ...    .67    .87    1.1
TABLE IB—Continued 

Number of A’s in sample = 4 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
   8   10.0   15.8   24.0   76.0   84.2   90.0
   9    8.7   13.8   21.1   69.9   78.8   85.5
  10    7.7   12.2   18.8   64.6   73.8   80.9
  11    6.9   11.0   17.0   60.0   69.3   76.8
  12    6.2    9.9   15.4   55.9   65.2   72.8
  13    5.7    9.1   14.2   52.3   61.5   69.1
  14    5.3    8.4   13.1   49.2   58.1   65.8
  15    4.9    7.8   12.2   46.4   55.1   62.8
  16    4.6    7.3   11.4   43.9   52.4   60.0
  17    4.3    6.8   10.7   41.6   49.9   57.4
  18    4.0    6.4   10.1   39.6   47.7   55.0
  19    3.8    6.1    9.5   37.7   45.5   52.7
  20    3.6    5.8    9.0   36.0   43.6   50.6
  21    3.4    5.5    8.6   34.5   41.9   48.8
  22    3.2    5.2    8.2   33.1   40.3   47.0
  23    3.1    5.0    7.8   31.8   38.8   45.4
  24    3.0    4.8    7.5   30.6   37.4   43.8
  25    2.8    4.6    7.2   29.5   36.1   42.4
  26    2.7    4.4    6.9   28.4   34.9   41.1
  27    2.6    4.2    6.6   27.5   33.8   39.8
  28    2.5    4.0    6.4   26.5   32.7   38.6
  29    2.4    3.9    6.1   25.7   31.6   37.4
  30    2.3    3.8    5.9   24.9   30.7   36.4
  32    2.2    3.5    5.6   23.4   29.0   34.4
  35    2.0    3.2    5.1   21.6   26.8   31.8
  37    1.9    3.0    4.8   20.5   25.4   30.3
  40    1.7    2.8    4.4   19.0   23.7   28.3
  45    1.5    2.5    3.9   17.0   21.2   25.4
  50    1.4    2.2    3.5   15.4   19.2   23.1
  55    1.3    2.0    3.2   14.0   17.6   21.2
  60    1.1    1.9    2.9   12.9   16.2   19.6
  70    .98    1.6    2.5   11.1   14.0   16.9
  80    .85    1.4    2.2    9.7   12.3   14.9
  90    .76    1.2    1.9    8.7   11.0   13.4
 100    .68    1.1    1.8    7.8    9.9   12.1
 120    ...    ...    1.5    6.5    8.2   10.1
 150    .45    .73    1.2    5.3    6.7    8.2
 200    .34    .55    .87    4.0    5.0    6.2
 300    .22    .36    .58    2.6    3.4    4.1
 500    .13    .22    .35    1.6    2.0    2.5
1000   .067    ...    ...    .80    1.0    1.3
TABLE IB—Continued 

Number of A’s in sample = 5 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  10   12.8   18.7   26.7   73.3   81.3   87.2
  11   11.4   16.8   24.0   68.2   76.6   83.2
  12   10.3   15.2   21.8   63.8   72.3   79.2
  13    9.4   13.9   20.0   59.8   68.4   75.5
  14    8.7   12.8    ...   56.3   64.9   72.1
  15    8.0   11.8   17.2   53.2   61.6   68.9
  16    7.5   11.0   16.0   50.3   58.7   65.9
  17    7.0   10.3   15.0   47.8   55.9   63.1
  18    6.6    9.7   14.2   45.5   53.5   60.6
  19    6.2    9.2   13.4   43.4   51.2   58.3
  20    5.8    8.7   12.7   41.5   49.1   56.0
  21    5.5    8.2   12.1   39.7   47.2   54.0
  22    5.3    7.8   11.5   38.1   45.4   52.1
  23    5.0    7.5   11.0   36.6   43.7   50.3
  24    4.8    7.1   10.5   35.2   42.2   48.6
  25    4.6    6.8   10.1   34.0   40.7   47.0
  26    4.4    6.6    9.7   32.8   39.4   45.5
  27    4.2    6.3    9.3   31.7   38.1   44.2
  28    4.1    6.1    9.0   30.6   36.9   42.8
  29    3.9    5.8    8.6   29.6   35.8   41.6
  30    3.8    5.6    8.3   28.7   34.7   40.4
  32    3.5    5.3    7.8   27.1   32.8   38.3
  35    3.2    4.8    7.1   24.9   30.3   35.4
  37    3.0    4.5    6.7   23.6   28.8   33.8
  40    2.8    4.2    6.2   22.0   26.8   31.5
  42    2.7    4.0    5.9   21.0   25.6   30.2
  45    2.5    3.7    5.5   19.6   24.1   28.4
  50    2.2    3.3    4.9   17.8   21.8   25.8
  55    2.0    3.0    4.5   16.2   20.0   23.7
  60    1.8    2.8    4.1   14.9   18.4   21.9
  70    1.6    2.4    3.5   12.8   15.9   18.9
  80    1.4    2.1    3.1   11.3   14.0   16.7
  90    1.2    1.8    2.7   10.1   12.5   14.9
 100    1.1    1.6    2.4    9.1   11.3   13.5
 120    .91    1.4    2.0    7.6    9.5   11.4
 150    .73    1.1    1.6    6.1    7.6    9.2
 200    .54    ...    1.2    4.6    5.7    6.9
 300    .36    .54    .81    3.1    3.8    4.6
 500    .22    .32    .49    1.8    2.3    2.8
1000    .11    .16    .24    .92    1.2    1.4
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TABLE IB—Continued 


Number of A’s in sample = 6 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  12   15.2   21.1   28.8   71.2   78.9   84.7
  13   13.8   19.2   26.4   66.9   74.9   81.2
  14   12.7   17.7   24.3   63.1   71.2   77.7
  15   11.7   16.3   22.5   59.7   67.7   74.4
  16   10.8   15.2   21.0   56.6   64.6   71.4
  17   10.1   14.2   19.7   53.7   61.7   68.5
  18    9.5   13.4   18.6   51.2   59.0   65.8
  19    8.9   12.6   17.5   48.9   56.6   63.3
  20    8.4   11.9   16.6   46.8   54.3   61.0
  21    8.0   11.3   15.8   44.8   52.2   58.8
  22    7.6   10.7   15.0   43.0   50.2   56.8
  23    7.2   10.2   14.3   41.3   48.4   54.9
  24    6.9    9.8   13.7   39.8   46.7   53.1
  25    6.6    9.4   13.1   38.3   45.1   51.4
  26    6.3    9.0   12.6   37.0   43.7   49.8
  27    6.1    8.6   12.1   35.8   42.3   48.3
  28    5.9    8.3   11.7   34.6   41.0   46.9
  29    5.6    8.0   11.3   33.5   39.7   45.5
  30    5.4    7.7   10.9   32.5   38.6   44.3
  32    5.1    7.2   10.2   30.6   36.4   42.0
  35    4.6    6.6    9.3   28.2   33.7   38.9
  37    4.4    6.2    8.7   26.7   32.0   37.1
  40    4.0    5.7    8.1   24.8   29.8   34.6
  42    3.8    5.4    7.7   23.7   28.5   33.2
  45    3.5    5.1    7.2   22.2   26.8   31.2
  50    3.2    4.5    6.4   20.1   24.3   28.4
  55    2.9    4.1    5.8   18.4   22.2   26.1
  60    2.6    3.8    5.3   16.9   20.5   24.1
  65    2.4    3.5    4.9   15.6   19.0   22.4
  70    2.2    3.2    4.6   14.6   17.7   20.9
  80    2.0    2.8    4.0   12.8   15.6   18.4
  90    1.7    2.5    3.5   11.4   13.9   16.5
 100    1.6    2.2    3.2   10.3   12.6   14.9
 120    1.3    1.9    2.6    8.6   10.6   12.5
 150    1.0    1.5    2.1    6.9    8.5   10.1
 200    .78    1.1    1.6    5.2    6.4    7.6
 300    .52    .74    1.1    3.5    4.3    5.1
 500    .31    .44    .63    2.1    2.6    3.1
1000    .15    .22    .32    1.1    1.3    1.6


TABLE IB—Continued 

Number of A’s in sample = 7 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  14   17.2   23.1   30.4   69.6   76.9   82.8
  15   15.9   21.3   28.2   65.8   73.4   79.5
  16   14.7   19.8   26.3   62.5   70.1   76.4
  17   13.7   18.4   24.6   59.4   67.0   73.5
  18   12.8   17.3   23.1   56.6   64.2   70.7
  19   12.1   16.3   21.8   54.1   61.6   68.1
  20   11.4   15.4   20.6   51.8   59.2   65.7
  21   10.8   14.6   19.6   49.6   57.0   63.4
  22   10.2   13.8   18.7   47.7   54.9   61.3
  23    9.8   13.2   17.8   45.8   52.9   59.2
  24    9.3   12.6   17.0   44.1   51.1   57.3
  25    8.9   12.1   16.3   42.6   49.4   55.6
  26    8.5   11.6   15.6   41.1   47.8   53.9
  27    8.2   11.1   15.0   39.7   46.3   52.3
  28    7.9   10.7   14.5   38.4   44.9   50.8
  29    7.6   10.3   14.0   37.2   43.5   49.4
  30    7.3    9.9   13.5   36.1   42.3   48.0
  31    7.0    9.6   13.0   35.0   41.1   46.7
  32    6.8    9.3   12.6   34.0   40.0   45.5
  33    6.6    9.0   12.2   33.1   38.9   44.4
  34    6.4    8.7   11.8   32.2   37.9   43.2
  35    6.2    8.4   11.5   31.3   36.9   42.2
  37    5.8    8.0   10.8   29.8   35.2   40.3
  40    5.4    7.3   10.0   27.7   32.8   37.6
  42    5.1    7.0    9.5   26.4   31.4   36.1
  45    4.7    6.5    8.9   24.8   29.5   34.0
  47    4.5    6.2    8.5   23.8   28.3   32.7
  50    4.2    5.8    8.0   22.4   26.7   30.9
  55    3.8    5.3    7.2   20.5   24.5   28.4
  60    3.5    4.8    6.6   18.8   22.6   26.2
  65    3.2    4.4    6.1   17.4   20.9   24.3
  70    3.0    4.1    5.7   16.2   19.5   22.7
  80    2.6    3.6    4.9   14.3   17.2   20.1
  90    2.3    3.2    4.4   12.7   15.4   18.0
 100    2.1    2.9    3.9   11.5   13.9   16.3
 120    1.7    2.4    3.3    9.6   11.6   13.7
 150    1.4    1.9    2.6    7.7    9.4   11.0
 200    1.0    1.4    2.0    5.8    7.1    8.4
 300    .68    .94    1.3    3.9    4.7    5.6
 500    .41    .56    .78    2.3    2.9    3.4
1000    .20    .28    .39    1.2    1.4    1.7
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TABLE IB—Continued 


Number of A’s in sample = 8 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  16   19.0   24.6   31.8   68.2   75.4   81.0
  17   17.7   23.0   29.8   64.9   72.2   78.1
  18   16.5   21.5   27.9   62.0   69.3   75.3
  19   15.5   20.2   26.3   59.2   66.5   72.6
  20   14.6   19.1   24.9   56.8   64.0   70.1
  21   13.8   18.1   23.6   54.4   61.6   67.7
  22   13.1   17.2   22.5   52.3   59.3   65.5
  23   12.5   16.4   21.4   50.3   57.3   63.4
  24   11.9   15.6   20.5   48.5   55.3   61.4
  25   11.4   15.0   19.6   46.7   53.5   59.5
  26   10.9   14.3   18.8   45.1   51.8   57.8
  27   10.4   13.8   18.1   43.7   50.2   56.1
  28   10.0   13.2   17.4   42.2   48.7   54.5
  29    9.6   12.7   16.8   40.9   47.2   53.0
  30    9.3   12.3   16.2   39.7   45.9   51.6
  31    9.0   11.9   15.7   38.5   44.6   50.2
  32    8.7   11.5   15.1   37.4   43.4   48.9
  33    8.4   11.1   14.7   36.4   42.3   47.7
  34    8.1   10.7   14.2   35.4   41.2   46.5
  35    7.9   10.4   13.8   34.5   40.1   45.4
  37    7.4    9.8   13.0   32.7   38.2   43.4
  40    6.8    9.0   12.0   30.4   35.6   40.6
  42    6.5    8.6   11.4   29.1   34.1   38.9
  45    6.0    8.0   10.6   27.3   32.1   36.6
  47    5.8    7.6   10.2   26.2   30.8   35.2
  50    5.4    7.2    9.5   24.7   29.1   33.4
  55    4.9    6.5    8.7   22.6   26.7   30.6
  60    4.5    5.9    7.9   20.8   24.6   28.3
  65    4.1    5.5    7.3   19.2   22.8   26.3
  70    3.8    5.1    6.8   17.9   21.3   24.6
  80    3.3    4.4    5.9   15.7   18.8   21.7
  90    2.9    3.9    5.2   14.0   16.8   19.4
 100    2.6    3.5    4.7   12.7   15.2   17.6
 120    2.2    2.9    3.9   10.6   12.7   14.8
 150    1.7    2.3    3.1    8.5   10.2   12.0
 200    1.3    1.7    2.3    6.4    7.7    9.0
 300    .86    1.2    1.6    4.3    5.2    6.1
 500    .52    .69    ...    2.6    3.1    3.7
1000    .26    .35    .47    1.3    1.6    1.8


TABLE IB—Continued 

Number of A’s in sample = 9 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  18   20.4   26.0   32.9   67.1   74.0   79.6
  19   19.2   24.4   31.0   64.2   71.1   76.9
  20   18.1   23.0   29.3   61.5   68.5   74.3
  21   17.1   21.8   27.8   59.0   66.0   71.9
  22   16.2   20.7   26.4   56.7   63.6   69.5
  23   15.4   19.7   25.2   54.6   61.5   67.4
  24   14.7   18.8   24.1   52.6   59.4   65.3
  25   14.0   18.0   23.0   50.8   57.5   63.4
  26   13.4   17.2   22.1   49.1   55.7   61.5
  27   12.9   16.5   21.2   47.5   54.0   59.8
  28   12.3   15.9   20.4   45.9   52.3   58.1
  29   11.9   15.3   19.7   44.5   50.8   56.5
  30   11.4    ...   19.0   43.2   49.4   55.0
  31   11.0    ...   18.4   41.9   48.0   53.6
  32   10.7   13.7   17.8   40.7   46.8   52.2
  33   10.3   13.3   17.2   39.6   45.5   50.9
  34   10.0   12.9   16.7   38.6   44.4   49.7
  35    9.7   12.5   16.2   37.5   43.3   48.5
  37    9.1   11.8   15.3   35.7   41.2   46.3
  40    8.4   10.8   14.1   33.2   38.5   43.4
  42    7.9   10.3   13.4   31.7   36.8   41.6
  45    7.4    9.6   12.4   29.7   34.6   39.2
  47    7.0    9.2   11.9   28.5   33.3   37.7
  50    6.6    8.6   11.2   26.9   31.4   35.7
  55    6.0    7.8   10.1   24.6   28.8   32.8
  60    5.5    7.1    9.3   22.6   26.6   30.4
  65    5.0    6.5    8.5   21.0   24.7   28.2
  70    4.6    6.1    7.9   19.5   23.0   26.4
  80    4.0    5.3    6.9   17.2   20.3   23.3
  90    3.6    4.7    6.1   15.3   18.1   20.9
 100    3.2    4.2    5.5   13.8   16.4   18.9
 120    2.7    3.5    4.6   11.6   13.8   15.9
 150    2.1    2.8    3.7    9.3   11.1   12.9
 200    1.6    2.1    2.7    7.0    8.4    9.7
 250    1.3    1.7    2.2    5.6    6.7    7.8
 300    1.1    1.4    1.8    4.7    5.6    6.5
 500    .63    .83    1.1    2.8    3.4    4.0
1000    .31    .41    .54    1.4    1.7    2.0
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TABLE IB—Continued 


Number of A’s in sample = 10 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
oe 20 21.8 27 
32.0 63.6 70.2 75.8 
a 22 24.4 30.5 61.1 67.8 73. 
18.5 23.2 29.0 5 3 
po 24 17.6 22.1 3.8 65.5 71.2 
27.7 56.7 63.3 69.1 
a 25 16.8 21.1 26.5 
. 54.8 61. 
16.1 20.2 25.4 52.9 $0 65.1 
19.4 24.4 51.2 57.6 63.3 
ae 3 14.8 23.5 49.6 55.9 61.6 
ee 17.9 22.6 48.0 54.3 59.9 
i” 30 13.7 17.3 21.8 46.6 52.8 58.4 
16.7 21.1 45.3 51.4 
32 12.7 16.1 36.9 
ae 33 12.7 20.4 44.0 50.0 55.4 
a 33 12.3 15.6 19.7 42.8 48.7 54.1 
; 19.1 41.6 47.5 52.8 
bie. 35 11.5 14.6 18.6 
40.5 
9.5 12.1 15.4 34.3 39.5 44.3 
- 14.3 32.1 37.1 41.7 
47 8.4 10.7 
$0 7.9 10.0 12.8 29.1 33.7 38.0 
: 60 6.5 8.3 10.6 24.5 28.5 32.4 
i. 6s 6.0 7.6 9.8 22.7 26.5 30.1 
7.1 9.1 21.2 
6 8.5 19.8 23.2 26.4 
¢-2 18.6 21.8 24.9 
3 7.0 16.6 19.5 22.3 
100 3.8 4 
4.9 6.3 15.0 17.6 20.2 
13.7 16.1 18.5 
oo 4.1 5.2 12.5 
= 14.8 17.0 
150 2.5 4.8 11.6 13.7 15.7 
4.2 10.1 11.9 13.7 
i 200 1. 2.4 3.1 7.6 9.0 10.4 
5 1.9 2.5 6.1 é 
300 1.2 1.6 2.1 5.1 6.0 70 
0 
96 1.2 3.1 3.6 4.2 
1000 
| 


TABLE IB—Continued 

Number of A’s in sample = 11 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  22   22.9   28.2   34.6   65.4   71.8   77.1
  23   21.8   26.8   32.9   63.0   69.4   74.8
  24   20.7   25.6   31.5   60.7   67.2   72.6
  25   19.8   24.4   30.1   58.6   65.1   70.6
  26   18.9   23.4   28.9   56.7   63.1   68.6
  27   18.1   22.4   27.7   54.9   61.2   66.7
  28   17.4   21.5   26.6   53.1   59.4   64.9
  29   16.7   20.7   25.7   51.5   57.7   63.2
  30   16.1   19.9   24.8   50.0   56.1   61.6
  31   15.4   19.2   23.9   48.6   54.6   60.0
  32   14.9   18.6   23.1   47.2   53.2   58.6
  33   14.4   18.0   22.4   45.9   51.8   57.1
  34   14.0   17.4   21.7   44.7   50.5   55.8
  35   13.5   16.9   21.0   43.5   49.3   54.5
  36   13.1   16.4   20.4   42.4   48.1   53.3
  37   12.7   15.9   19.8   41.4   47.0   52.1
  38   12.4   15.4   19.3   40.4   45.9   51.0
  39   12.0   15.0   18.8   39.4   44.9   49.9
  40   11.7   14.6   18.3   38.5   43.9   48.8
  42   11.1   13.9   17.4   36.8   42.0   46.9
  45   10.3   12.9   16.2   34.5   39.5   44.2
  47    9.8   12.3   15.5   33.2   38.0   42.6
  50    9.2   11.5   14.5   31.3   36.0   40.3
  52    8.8   11.1   13.9   30.2   34.7   39.0
  55    8.3   10.4   13.1   28.6   33.0   37.1
  60    7.6    9.5   12.0   26.4   30.4   34.3
  65    7.0    8.8   11.1   24.4   28.3   31.9
  70    6.5    8.1   10.3   22.8   26.4   29.9
  75    6.0    7.6    9.6   21.3   24.7   28.0
  80    5.6    7.1    9.0   20.0   23.3   26.4
  90    5.0    6.3    7.9   17.9   20.8   23.7
 100    4.5    5.6    7.1   16.1   18.8   21.5
 110    4.0    5.1    6.5   14.7   17.2   19.6
 120    3.7    4.7    5.9   13.5   15.8   18.1
 130    3.4    4.3    5.5   12.5   14.6   16.7
 150    2.9    3.7    4.7   10.8   12.7   14.6
 170    2.6    3.3    4.2    9.6   11.3   12.9
 200    2.2    2.8    3.5    8.2    9.6   11.1
 250    1.7    2.2    2.8    6.6    7.7    8.9
 300    1.5    1.8    2.4    5.5    6.5    7.4
 400    1.1    1.4    1.8    4.1    4.9    5.6
 500    ...    1.1    1.4    3.3    3.9    4.5
1000    .43    ...    ...    1.7    2.0    2.3
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TABLE IB—Continued 

Number of A’s in sample = 12 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  24   24.0   29.1   35.3   64.7   70.9   76.0
  25   22.8   27.8   33.8   62.5   68.7   73.9
  26   21.8   26.6   32.4   60.4   66.6   71.9
  27   20.9   25.5   31.1   58.5   64.7   70.0
  28   20.0   24.5   29.9   56.6   62.8   68.1
  29   19.2   23.5   28.8   54.9   61.1   66.4
  30   18.5   22.7   27.7   53.3   59.4   64.7
  31   17.8   21.9   26.8   51.8   57.8   63.1
  32   17.2   21.1   25.9   50.4   56.3   61.6
  33   16.6   20.4   25.1   49.0   54.9   60.1
  34   16.1   19.8   24.3   47.7   53.5   58.7
  35   15.6   19.2   23.5   46.5   52.2   57.3
  36   15.1   18.6   22.9   45.3   51.0   56.1
  37   14.6   18.0   22.2   44.2   49.8   54.8
  38   14.2   17.5   21.6   43.1   48.7   53.7
  39   13.8   17.0   21.0   42.1   47.6   52.5
  40   13.4   16.6   20.4   41.2   46.5   51.4
  42   12.7   15.7   19.4   39.3   44.6   49.4
  45   11.8   14.6   18.1   36.9   41.9   46.6
  47   11.3   14.0   17.3   35.5   40.4   44.9
  50   10.6   13.1   16.2   33.5   38.2   42.6
  52   10.1   12.5   15.6   32.3   36.8   41.1
  55    9.5   11.8   14.7   30.6   35.0   39.2
  60    8.7   10.8   13.4   28.2   32.3   36.3
  65    8.0    9.9   12.4   26.1   30.0   33.7
  70    7.4    9.2   11.5   24.3   28.0   31.6
  75    6.9    8.5   10.7   22.8   26.3   29.6
  80    6.4    8.0   10.0   21.4   24.7   27.9
  90    5.7    7.1    8.9   19.1   22.1   25.0
 100    5.1    6.4    8.0   17.3   20.0   22.7
 110    4.6    5.8    7.2   15.7   18.3   20.7
 120    4.2    5.3    6.6   14.5   16.8   19.1
 130    3.9    4.9    6.1   13.4   15.6   17.7
 150    3.4    4.2    5.3   11.6   13.6   15.5
 170    3.0    3.7    4.6   10.3   12.0   13.7
 200    2.5    3.1    3.9    8.8   10.2   11.7
 250    2.0    2.5    3.2    7.0    8.2    9.4
 300    1.7    2.1    2.6    5.9    6.9    7.9
 400    1.2    1.6    2.0    4.4    5.2    5.9
 500    .99    1.2    1.6    3.5    4.2    4.8
1000    .50    .62    .78    1.8    2.1    2.4
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TABLE IB—Continued 

Number of A’s in sample = 13 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  26   24.9   29.9   35.9   64.1   70.1   75.1
  27   23.8   28.7   34.5   62.0   68.0   73.2
  28   22.8   27.5   33.1   60.1   66.1   71.2
  29   21.9   26.4   31.9   58.3   64.3   69.4
  30   21.1   25.5   30.8   56.6   62.6   67.7
  31   20.3   24.6   29.7   55.0   60.9   66.1
  32   19.6   23.7   28.7   53.5   59.3   64.5
  33   18.9   22.9   27.8   52.0   57.8   63.0
  34   18.3   22.2   26.9   50.7   56.4   61.5
  35   17.7   21.5   26.1   49.4   55.1   60.1
  36   17.2   20.8   25.3   48.2   53.8   58.8
  37   16.6   20.2   24.6   47.0   52.5   57.5
  38   16.2   19.6   23.9   45.9   51.3   56.3
  39   15.7   19.1   23.3   44.8   50.2   55.1
  40   15.3   18.6   22.6   43.8   49.1   54.0
  42    ...   17.6   21.5   41.8   47.1   51.9
  45   13.4   16.4   20.0   39.3   44.3   48.9
  47   12.8   15.6   19.1   37.7   42.6   47.2
  50   12.0   14.6   17.9   35.6   40.3   44.7
  52   11.5   14.0   17.2   34.3   38.9   43.2
  55   10.8   13.2   16.2   32.6   37.0   41.2
  60    9.9   12.1   14.9   30.0   34.2   38.1
  65    9.0   11.1   13.7   27.8   31.8   35.5
  70    8.4   10.3   12.7   25.9   29.7   33.2
  75    7.8    9.6   11.8   24.2   27.8   31.2
  80    7.3    9.0   11.1   22.8   26.2   29.4
  85    6.8    8.4   10.4   21.5   24.7   27.8
  90    6.4    7.9    9.8   20.4   23.4   26.4
  95    6.1    7.5    9.3   19.3   22.3   25.1
 100    5.8    7.1    8.8   18.4   21.2   23.9
 110    5.2    6.4    8.0   16.8   19.4   21.9
 120    4.8    5.9    7.3   15.4   17.8   20.1
 130    4.4    5.4    6.7   14.2   16.5   18.7
 150    3.8    4.7    5.8   12.3   14.4   16.3
 170    3.3    4.1    5.1   11.0    ...   14.4
 200    2.8    3.5    4.4    9.3   10.9   12.4
 250    2.3    2.8    3.5    7.5    8.7    9.9
 300    1.9    2.3    2.9    6.3    7.3    8.3
 400    1.4    1.7    2.2    4.7    5.5    6.3
 500    1.1    1.4    1.7    3.8    4.4    5.0
1000    .56    .69    .87    1.9    2.2    2.5


TABLE IB—Continued 

Number of A’s in sample = 14 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
ray 36.5 63.5 69.3 74.3 
33.1 61.6 67.5 72.4 
7 a 33.8 59.9 65.7 70.7 
32.7 58.2 64.0 69.0 
Pe 31.6 56.6 62.3 67.3 
| 30.5 55.1 60.8 65.8 
4 29.6 53.6 59.3 64.3 
28.7 52.3 57.9 62.9 
4 27.8 51.0 56.5 61.5 
a 27.0 49.7 55.2 60.2 
oo 26.3 48.5 54.0 58.9 
ee 25.5 47.4 52.8 57.7 
+. 24.9 46.3 51.7 56.5 
i 23.6 44.3 49.5 54.3 
22.0 41.6 46.6 51.3 
ba 21.0 40.0 44.9 49.4 
: 19.7 37.7 42.5 46.9 
By 18.9 36.4 41.0 45.3 
ae 17.8 34.5 39.0 43.2 
17.2 33.4 37.7 41.8 
aes 16.3 31.8 36.0 40.0 
ae 15.0 29.5 33.5 37.2 
— 13.9 27.5 31.3 34.9 
i 12.9 25.7 29.3 32.7 
Ae: 12.1 24.2 27.6 30.9 
eo 11.4 22.8 26.1 29.2 
ee 10.7 21.6 24.7 27.7 
=a 9.6 19.5 22.4 25.1 
oe). 8.8 17.8 20.4 23.0 
Bee 8.0 16.3 18.8 21.2 
a 7.4 15.1 17.4 19.6 
a 6.8 14.1 16.2 18.3 
ae 5.6 11.6 13.4 15.2 
a 4.8 9.9 11.5 13.0 
oF 3.8 8.0 9.2 10.5 
Rh 3.2 6.6 7.7 8.7 
2.4 5.0 5.8 6.6 
4.0 4.7 5.3 
.95 2.0 2.3 2.7 
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TABLE IB—Continued 


Number of A’s in sample = 15 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  30   26.5   31.3   37.0   63.0   68.7   73.5
  31   25.5   30.2   35.7   61.3   66.9   71.8
  32   24.6   29.1   34.5   59.6   65.3   70.1
  33   23.7   28.2   33.3   58.0   63.7   68.5
  34   22.9   27.3   32.3   56.5   62.1   67.0
  35   22.1   26.4   31.3   55.1   60.7   65.5
  36   21.5   25.6   30.4   53.7   59.2   64.1
  37   20.8   24.9   29.5   52.4   57.9   62.8
  38   20.2   24.2   28.6   51.2   56.6   61.4
  39   19.6   23.5   27.9   50.0   55.4   60.2
  40   19.1   22.8   27.1   48.9   54.2   59.0
  42   18.0   21.7   25.8   46.8   52.0   56.7
  45   16.7   20.1   24.0   43.9   49.0   53.6
  47   15.9   19.2   22.9   42.2   47.1   51.6
  50   14.9   18.0   21.5   39.8   44.6   49.0
  52   14.3   17.2   20.6   38.4   43.1   47.4
  55   13.4   16.2   19.4   36.4   41.0   45.2
  57   12.9   15.6   18.7   35.2   39.7   43.8
  60   12.2   14.8   17.7   33.6   37.9   41.8
  65   11.2   13.6   16.3   31.1   35.2   39.0
  70   10.4   12.6   15.1   29.0   32.9   36.5
  75    9.7   11.7   14.1   27.2   30.8   34.3
  80    9.0   11.0   13.2   25.5   29.0   32.3
  85    8.5   10.3   12.4   24.1   27.4   30.6
  90    8.0    9.7   11.7   22.8   26.0   29.0
 100    7.2    8.7   10.5   20.6   23.5   26.3
 110    6.5    7.9    9.5   18.8   21.5   24.1
 120    5.9    7.2    8.7   17.3   19.8   22.2
 130    5.5    6.6    8.0   16.0   18.3   20.6
 140    5.1    6.2    7.5   14.9   17.1   19.2
 150    4.7    5.7    7.0   13.9   16.0   18.0
 170    4.1    5.0    6.1   12.3   14.1   15.9
 200    3.5    4.3    5.2   10.5   12.1   13.6
 250    2.8    3.4    4.2    8.4    9.7   11.0
 300    2.3    2.8    3.5    7.0    8.1    9.2
 400    ...    2.1    2.6    5.3    6.1    6.9
 500    1.4    1.7    2.1    4.2    4.9    5.6
1000    .69    ...    1.0    2.1    2.5    2.8


TABLE IB—Continued 

Number of A’s in sample = 16 

         Lower limits (%)     Upper limits (%)
 N\P   .005   .025    .10    .10   .025   .005
  32   27.2   31.9   37.4   62.6   68.1   72.8
  33   26.3   30.8   36.2   60.9   66.4   71.1
  34   25.4   29.8   35.0   59.4   64.8   69.6
  35   24.5   28.9   34.0   57.9   63.3   68.1
  36   23.7   28.0   32.9   56.4   61.9   66.6
  37   23.0   27.2   32.0   55.1   60.5   65.2
  38   22.3   26.4   31.1   53.8   59.2   63.9
  39   21.7   25.7   30.2   52.6   57.9   62.6
  40   21.1   25.0   29.4   51.4   56.7   61.3
  42   19.9   23.7   27.9   49.2   54.4   59.0
  45   18.5   22.0   26.0   46.2   51.2   55.8
  47   17.6   21.0   24.8   44.4   49.3   53.8
  50   16.4   19.6   23.2   41.9   46.7   51.1
  52   15.8   18.8   22.3   40.4   45.1   49.4
  55   14.8   17.7   21.0   38.4   42.9   47.1
  57   14.3   17.1   20.3   37.1   41.5   45.6
  60   13.5   16.2   19.2   35.4   39.6   43.6
  62   13.0   15.6   18.6   34.3   38.5   42.4
  65   12.4   14.9   17.7   32.8   36.9   40.7
  67   12.0   14.4   17.1   31.9   35.9   39.6
  70   11.4   13.8   16.4   30.6   34.4   38.1
  75   10.6   12.8   15.3   28.6   32.3   35.8
  80    9.9   12.0   14.3   26.9   30.4   33.8
  85    9.3   11.2   13.4   25.4   28.8   31.9
  90    8.8   10.6   12.7   24.0   27.3   30.3
 100    7.9    9.5   11.4   21.7   24.7   27.5
 110    7.1    8.6   10.3   19.8   22.5   25.2
 120    6.5    7.9    9.4   18.2   20.7   23.2
 130    6.0    7.2    8.7   16.8   19.2   21.5
 140    5.6    6.7    8.1   15.7   17.9   20.0
 150    5.2    6.2    7.5   14.6   16.7   18.8
 170    4.6    5.5    6.6   13.0   14.8   16.7
 200    3.9    4.7    5.6   11.0   12.7   14.2
 250    3.1    3.7    4.5    8.9   10.2   11.5
 300    2.6    3.1    3.7    7.4    8.5    9.6
 400    1.9    2.3    2.8    5.6    6.4    7.2
 500    1.5    1.8    2.2    ...    5.1    5.8
1000    .76    .92    1.1    2.2    2.6    2.9
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TABLE 1B—Continued 
CONFIDENCE LIMITS FOR TWOFOLD’ CLASSIFICATION OF ENUMERATION DATA 


NUMBER OF A’s IN SAMPLE: 1 TO 20—Continued 


Upper limits (%) 


Lower limits (%) 


Number of A’s in sample = 18 


N = total number of individuals in sample. 


divide values for N = 1000 by N/1000. 


Note.—For mode of use see Examples 1 to 7. 


P = probability. For N > 1000 


i, 
122 
= 005 | 025 | 10 | 10 | 025 | 005 
; 
ae 36 28.5 33.0 38.2 61.8 67.0 71.5 
ioe 37 27.6 32.0 37.1 60.3 65.5 70.0 
ae 38 26.8 31.1 36.0 58.9 64.1 68.6 
Sa 39 26.0 30.2 35.0 57.6 62.8 67.3 
Due 40 25.3 29.4 34.1 56.3 61.5 66.0 
. ea 41 24.5 28.6 33.2 55.1 60.2 64.7 
a. 42 23.9 27.8 32.4 53.9 59.0 63.5 
a 43 23.3 27.1 31.6 52.8 57.8 62.3 
44 22.7 26.5 30.8 51.7 56.7 61.2 
a 45 22.1 25.8 30.1 50.6 55.6 60.1 
- | 46 21.6 25.2 29.4 49.6 54.6 59.0 
— 47 21.0 24.6 28.7 48.7 53.6 58.0 
a 48 20.6 24.1 28.1 47.8 52.6 57.0 
a 49 20.1 23.6 27.5 46.9 51.7 56.0 
Be 50 19.6 23.1 26.9 46.0 50.8 55.1 
ae 52 18.8 22-1 25.8 44.4 49.1 53.3 
oe 55 17.7 20.8 24.3 42.1 46.7 50.8 
a 57 17.0 20.0 23.4 40.8 45.2 49.3 
oa 60 16.1 19.0 22.2 38.8 43.2 47.2 
: 62 15.5 18.3 21.5 37.7 41.9 45.8 
e 65 14.8 17.4 20.4 36.0 40.2 44.0 
i 67 14.3 16.9 19.8 35.0 39.1 42.8 
a 70 13.6 16.1 18.9 33.6 37.5 41.2 
. 75 12.7 15.0 17.6 31.5 35.2 38.7 
E 80 11.8 14.0 16.5 29.6 33.2 36.6 
Pe 85 11.1 13.1 15.5 27.9 31.4 34.6 
aa 90 10.4 12.4 14.6 26.4 29.7 32.8 
2 100 9.3 11.1 13.1 23.9 26.9 29.8 
= eg 110 8.5 10.0 11.9 21.8 24.6 27.3 
120 7.7 9.2 10.9 20.0 22.7 25.2 
ats. 130 7.1 8.5 10.0 18.5 21.0 23.3 
.. 140 6.6 7.8 9.3 17.3 19.6 21.8 
a 150 6.1 7.3 8.7 16.1 18.3 20.4 
dak 170 5.4 6.4 7.6 14.3 16.2 18.1 
ae 200 4.6 5.4 6.5 12.2 13.9 15.5 
> aa 250 3.6 4.3 5.2 9.8 11.1 12.5 
Looe 300 3.0 3.6 4.3 8.2 9.3 10.4 
400 2.3 2.7 3.2 6.1 7.0 7.9 
ie 500 1.8 2.2 2.6 4.9 5.6 6.3 
1000 1.1 2.5 2.8 3.2 
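
The rule in the note to Table IB for samples larger than 1000 can be sketched as below. The function name is hypothetical. The rule presumably rests on the fact that, for a fixed small number of A's in a large sample, the limits for the percentage shrink almost in proportion to 1/N; that rationale is an interpretation, not a statement from the table.

```python
def limit_for_large_n(limit_at_1000, n):
    """Approximate limit (%) for N > 1000 from the tabulated value at N = 1000."""
    if n <= 1000:
        raise ValueError("the rule applies only for N > 1000")
    return limit_at_1000 / (n / 1000)

# 18 A's in a sample of 2500: the upper limit at P = .025 for N = 1000 is 2.8%
print(round(limit_for_large_n(2.8, 2500), 2))  # 1.12
```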


TABLE IB—Continued

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 1 TO 20—Continued

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Number of A’s in sample = 19
38 29.1 33.5 38.6 61.4 66.5 70.9
39 28.2 32.5 37.5 60.1 65.1 69.5
40 27.4 31.6 36.5 58.7 63.8 68.2
41 26.6 30.8 35.5 57.5 62.5 66.9
42 25.9 30.0 34.6 56.2 61.3 65.7
43 25.2 29.2 33.7 55.1 60.1 64.5
44 24.6 28.5 32.9 53.9 58.9 63.3
45 24.0 27.8 32.1 52.9 57.8 62.2
46 23.4 27.1 31.4 51.8 56.7 61.1
47 22.8 26.5 30.7 50.8 55.7 60.0
48 22.3 25.9 30.0 49.9 54.7 59.0
49 21.8 25.3 29.4 48.9 53.7 58.0
50 21.3 24.8 28.7 48.0 52.8 57.0
52 20.4 23.8 27.6 46.3 51.0 55.2
55 19.2 22.4 26.0 44.0 48.5 52.7
57 18.4 21.5 25.0 42.6 47.0 51.1
60 17.4 20.4 23.7 40.6 44.9 48.9
62 16.8 19.7 22.9 39.4 43.6 47.5
65 16.0 18.7 21.8 37.6 41.8 45.6
67 15.5 18.1 21.2 36.6 40.7 44.4
70 14.7 17.3 20.2 35.1 39.1 42.7
75 13.7 16.1 18.8 32.9 36.7 40.2
80 12.8 15.0 17.6 30.9 34.6 37.9
85 12.0 14.1 16.5 29.2 32.7 35.9
90 11.3 13.3 15.6 27.6 31.0 34.1
100 10.1 11.9 14.0 25.0 28.1 31.0
110 9.1 10.8 12.7 22.8 25.6 28.4
120 8.3 9.9 11.6 21.0 23.6 26.1
130 7.7 9.1 10.7 19.4 21.9 24.2
140 7.1 8.4 9.9 18.0 20.4 22.6
150 6.6 7.8 9.3 16.9 19.1 21.2
170 5.8 6.9 8.2 14.9 16.9 18.8
200 4.9 5.8 6.9 12.7 14.4 16.1
250 3.9 4.7 5.5 10.2 11.6 13.0
300 3.3 3.9 4.6 8.5 9.7 10.9
400 2.4 2.9 3.4 6.4 7.3 8.2
500 1.9 2.3 2.7 5.1 5.9 6.6
1000 .97 1.1 1.4 2.6 3.0 3.3

Note.—For mode of use see Examples 1 to 7. N = total number of individuals in sample. P = probability. For N > 1000 divide values for N = 1000 by N/1000.
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TABLE IB—Concluded 


CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 1 TO 20—Concluded

Note.—For mode of use see Examples 1 to 7. N = total number of individuals in sample. P = probability. For N > 1000, divide values for N = 1000 by N/1000.

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Number of A’s in sample = 20

40 29 38,9 61 66 
41 28 37 $9 64 
42 28 36 

>s 43 27 35 57 62 

a 44 26 35 56 61 

tam 45 25 34 55 59 
ee. 46 25 33 54 58 

ei 47 24 32 52 57 62 
ee. 48 24 32 51 56 61 
ae 49 23 31 51 55 60 
Ber 50 23 30 50 54 59 
a 51 22 30 49 53 58 

: 52 22 29 48 52 57 
ei 53 21 28 47 ; 52 56 
2 a 54 21 28 46 51 55 

apt 55° 20 27 45 50 54 

56 20 27 45 49 53 

oe 57 19 26 44 48 52 
ia 58 19 26 43 48 52 
+ & 59 19 25 43 47 51 
~ 60 18 25 42 46 50 
rag 62 18 24 41 45 49 
—— 65 17 23 39 43 47 
oc 5 ll 67 16 22 38 42 46 
a 70 15 21 3 40) 44 
oP ae 75 14 20 34 38.1 41 
ae 80 13 18 32 35.9 39 
ae 85 12 17 3 34.0 37 
+ 90 12 16 2 32.2 35 
4 : 100 10 14 2 29.2 32 
120 

140 
150 

170 
250 
300 

400 
4 


TABLE II

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER

Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 50
40 29 | 38.9 61.1 66.1 70.4 
43 30 39.3 60.7 65.5 69.6 ; 
47 31, 39.8 60.2 64.8 68.8 ; 
50 31 40.1 59.9 64.3 68.2 
55 32 40.6 59.4 63.7 67.4 
60 33 41.1 58.9 63.1 66.6 : 
65 34 41.4 58.6 62.5 66.0 
70 34 41.8 58.2 62.1 65.4 
75 35 42.1 57.9 61.6 64.9 
80 35 42.3 57.7 61.3 64.4 
85 36 42.6 57.4 60.9 64.0 os 
90 36 42.8 57.2 60.6 63.6 re 
100 37 43.2 56.8 60.0 62.9 
110 37 43.5 56.5 59.6 62.3 
120 38 43.8 56.2 59.1 61.8 . 
130 38 44.1 55.9 58.8 61.3 : 
150 39 44.5 55.5 58.2 60.5 
170 40 44.9 55.1 57.7 59.9 ey 
200 40 45.3 54.7 57.0 59.1 : 
220 41 45.5 54.5 56.7 58.7 % 
250 41 45.8 54.2 56.3 58.1 
300 42 46.2 53.8 55.7 57.4 
350 43 46.5 53.5 55.3 56.9 
400 43 46.7 53.3 55.0 56.4 
500 44 47.1 52.9 54.4 55.8 
600 44 47.3 52.7 54.0 55.3 
700 45 47.5 52.5 53.7 54.9 
1M 45 47.9 52.1 53.1 54.1 
1.5M 46 48.3 51.7 52.5 53.3 
2M 47 48.5 51.5 52.2 52.9 
3M 47 48.8 51.2 51.8 52.4 
5M 48 49.1 50.9 51.4 51.8 
10M 48 49.4 50.6 51.0 51.3 
20M 49 49.5 50.5. 50.7 50.9 j 
50M 49 49.7 50.3 50.4 50.6 hum 
100 M 49 49.8 50.2 50.3 50.4 


TABLE II—Continued

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued

Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.

Lower limits (%) Upper limits (%)
Percentage of A’s in sample = 45
43 
47 
50 
50 
65 
80 
90 
110 
120 
130 
150 
170 
200 
"250 
300 
350 
500 
700 
1 
1.5 
¥ 3 
10 
20} 
1001 


TABLE II—Continued

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued

Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 40
43 21.7 ; 
47 21.7 25.5 29.8 og 
22.5 26.1 30.3 56.0 
55 26.6 4 55 
23.7 30.6 50 3 59.6 ; 
60 27.2 0 54 : 
24.4 31.1 8 59.0 
27.7 31.5 54.0 58.0 
65 25.0 53.4 57. 
28.2 8 
25.5 28.6 48.7 52.8 
= 6.0 29.0 - 48.3 56.5 
26.4 32.4 52.3 55 
85 29.4 48.0 5 9 
26.8 32.7 1.9 5 hi 
29°7 47.7 5.3 
32.9 47.5 51.5 54.8 . 
90 27.2 51.1 54. , 
100 30.0 
reo 27.8 30.5 as 47.3 50.8 
28.3 30.9 4 46.9 53.9 
28.8 33.8 2 53 
130 31.3 46.5 49 2 
29.2 34.1 7 52 
31.6 34.3 46.2 49.3 
140 29.6 48.9 a's 
30.0 34.5 45.7 
30.6 34.7 48.5 51 
200 32.7 45.5 48 “A 
31.3 35.0 2 50 
250 33.2 45.2 47 34 
32.2 35.4 7 5 
34.0 44.7 0.0 
35.9 47.1 4 
44.2 4 9.2 
300 32.9 34.5 
50 33.4 36.3 43 | 
400 = 34.9 36.6 8 45.7 
500 35.2 43.5 45.3 
ra 4 35.7 43.3 44. 46.9 
35.3 37.1 9 46.4 
36.4 37.6 44.4 
1M 36.1 5 43.7 44.8 asi 
1.5M 36.8 37.0 38.0 : 
2M 5 37.5 : 42.0 43 
37.2 38.4 41 1 44.0 
3M| 37.7 38.2 38.6 43.3 
5M 38.2 41.2 42.8 
42.3 
39.1 39.0 39.4 
50M 30.4 39.3 39.6 et 41.0 41.3 
100 M 39.6 4 40.7 i 
39.6 39.7 40 40.9 
39.7 3 40 
TABLE II—Continued

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 35
50 18.8 22.2 26.0 45.0 49.8 54.1 
55 19.5 22.8 26.4 44.5 49.0 53.1 
60 20.1 23.3 26.8 44.0 48.4 52.3
65 20.7 i! 43.6 47.8 51.5 
70 21.2 24.1 27.4 43.3 47.3 50.9 
75 21.6 24.5 27.7 43.0 46.8 50.3
80 22.0 24.8 27.9 42.7 46.4 49.8 
90 22.7 25.4 28.3 42.2 45.7 48.9 
100 23.3 25.9 28.7 41.8 45.1 48.1 
110 23.8 26.3 29.0 41.4 44.6 47.5 
120 24.3 26.6 29.3 41.1 44.2 46.9 
130 24.7 27.0 29.5 40.9 43.8 46.4 
140 25.0 27.2 29.7 40.6 43.5 46.0 
150 25.4 27.5 29.9 40.4 43.1 45.6 
170 25.9 28.0 30.2 40.1 42.6 44.9 
200 26.6 28.5 30.6 39.7 42.0 44.1 
250 27.5 29.2 o1.3 39.1 41.2 43.1 
300 28.1 29.7 31.4 38.8 40.7 42.4 
350 28.6 30.1 31.7 38.5 40.2 41.8
400 29.0 30.4 31.9 38.2 39.9 41.4 
500 29.6 30.9 32.2 37.9 39.3 40.7 
700 30.4 a $2.7 37.4 38.6 39.8 
1M 31.2 32.1 33.0 37.0 38.0 39.0 
1.5M 31.9 32.6 33.4 36.6 37.5 38.2 
2M 32.3 32.9 33.6 36.4 37.1 37.8 
3M 32.8 33.3 33.9 36.1 36.7 37.3 
5M 33.3 es 34.1 35.9 36.3 36.8 
10M 33.8 34.1 34.4 35.6 35.9 36.2 
20M 34.1 34.3 34.6 35.4 35.7 35.9 
50M 34.6 34.7 35.3 35.4 35.6 
100 M 34.6 34.7 34.8 35.2 35.3 35.4 
Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.

TABLE II—Continued
CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued
Lower limits (%) Upper limits (%) 
.005 .025 .10 .10 .025 .005 
Percentage of A's in sample = 30 
55 15.5 18.5 21.9 39.3 43.8 48.0 
60 16.1 19.0 22.2 38.8 43.2 47.2 
65 16.6 19.4 22.5 38.5 42.6 46.4 
70 17.0 19.8 22.8 38.1 42.1 45.8 
75 17.4 20.1 23.1 37.8 41.6 45.2 
80 17.8 20.4 23.3 37.5 41.2 44.6 
85 18.1 20.7 23.5 37.3 40.9 44.2 
90 18.4 20.9 23.7 37.0 40.5 43.7 
95 18.7 21.1 23.8 36.8 40.2 43.3 
100 19.0 21.4 24.0 36.6 39.9 43.0 
110 19.5 21.7 24.3 36.3 39.4 42.3 
120 19.9 22.1 24.5 36.0 39.0 41.7 
130 20.3 22.4 24.8 35.7 38.6 41.2 
150 20.9 22.9 25.1 35.3 38.0 40.4 
170 21.4 23.3 25.4 35.0 37.4 39.7 
200 22.1 23.8 25.8 34.5 36.8 38.9 
250 22.9 24.5 26.2 34.0 36.1 37.9 
300 23.5 24.9 26.6 33.6 35.5 : 
350 23.9 25.3 26.8 33.4 ... 36.6
400 24.3 25.6 27.0 33.1 34.7 36.2 
500 24.9 26.1 27.3 32.8 34.2 35.5 
700 25.7 26.7 27.8 a2.a 33.5 34.6 
1M 26.3 27.2 28.1 31.9 32.9 33.8 
1.5M 27.0 Ys 28.5 31.6 32.4 33.1 
2M 27.4 28.0 28.7 31.4 32.1 32.7 
3M 27.9 28.4 28.9 31.1 31.7 32.2
5M 28.3 28.7 29.2 30.8 31.3 31.7
10M 28.8 29.1 29.4 30.6 30.9 31.2
20M 29.2 29.4 29.6 30.4 30.6 30.8 
50M 29.5 29.6 29.7 30.3 30.4 30.5 
100 M 29.6 29.7 29.8 30.2 30.3 30.4 
Percentage of A’s in sample = 25 
65 12.7 15.2 18.0 33.2 37.3 41.1 
70 13.1 15.5 18.3 32.8 36.8 40.4 
75 13.4 15.8 18.5 32.5 36.3 39.8 
80 13.8 16.1 18.7 32.3 35.9 39.3 
85 14.1 16.3 18.9 32.0 35.5 38.8 
90 14.3 16.6 19.1 31.8 35.2 38.4 
100 14.8 17.0 19.4 31.4 34.6 37.6 
110 15.3 17.3 19.7 31.1 34.1 37.0
120 15.6 17.6 19.9 30.8 33.7 36.4 
130 16.0 17.9 20.1 30.5 33.3 35.9 
Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.


TABLE II—Continued

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 25
150 16.5 18.4 20.4 30.1 32.7
170 17.0 18.8 20.7 29.8 32.2
200 17.6 19.2 21.0 29.3 31.6
250 18.3 19.8 21.5 28.8 30.8
300 18.9 20.3 21.8 28.5 30.3
350 19.3 20.6 22.0 28.2 29.9
400 19.7 20.9 22.2 28.0 29.5
500 20.2 21.3 22.5 27.7 29.0
700 20.9 21.9 22.9 27.2 28.4
1M 21.6 22.4 23.2 26.8 27.8
1.5M 22.2 22.8 23.6 26.5 27.3
2M 22.6 23.1 23.8 26.3 27.0 27
3M 23.0 23.5 24.0 26.0 26.6 27
5M 23.4 23.8 24.2 25.8 26.2 26
10M 23.9 24.2 24.4 25.6 25.9 26
20M 24.2 24.4 24.6 25.4 25.6 25
50M 24.5 24.6 24.8 25.2 25.4 25
100 M 24.6 24.7 24.8 25.2 25.3 25
Percentage of A’s in sample = 20
80 9.9 12.0 14.3 26.9 30.4 33.8
85 10.2 12.2 14.5 26.7 30.1 33.3
90 10.4 12.4 14.6 26.4 29.7 32.8
100 10.9 12.7 14.9 26.1 29.2 32.1
110 11.2 13.1 15.1 25.7 28.7 31.5
120 11.6 13.3 15.3 25.5 28.3 30.9
130 11.9 13.6 15.5 25.2 27.9 30.4
150 12.4 14.0 15.8 24.8 27.3 29.6
170 12.8 14.3 16.1 24.5 26.8 29.0
200 13.3 14.7 16.4 24.1 26.2 28.2
250 14.0 15.3 16.8 23.6 25.5 27.2
300 14.4 15.7 17.0 23.3 25.0 26.5
350 14.8 16.0 17.3 23.0 24.6 26.0
400 15.1 16.2 17.4 22.8 24.3 25.6
500 15.6 16.6 17.7 22.5 23.8 25.0
700 16.3 17.1 18.1 22.1 23.2 24.2
1M 16.9 17.6 18.4 21.7 22.6 23.4
1.5M 17.4 18.0 18.7 21.4 22.1 22.8
2M 17.8 18.3 18.9 21.2 21.8 22.4
3M 18.2 18.6 19.1 21.0 21.5 21.9
5M 18.6 18.9 19.3 20.7 21.1 21.5
10M
20M
50M
100 M

Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.
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TABLE II—Continued 


CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued

Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 15
100 10.5 20.6 23.5 26.3 
110 10.7 20.3 23.1 25.7 
120 10.9 20.0 22.7 25.2 
130 11.0 19.8 22.3 24.7 
150 11.3 19.4 21.7 23.9 
170 1 11.5 19.1 21.3 23.3 e 
200 1 11.8 18.8 20.7 22.6 ee 
250 1 12.1 18.3 20.0 21.7 | 
300 1 11.2 12.4 18.0 19.6 21.0 
350 1 11.5 12.6 17.7 19.2 20.5 aa 
400 1 11.7 12.7 17.5 18.9 20.1 : 
500 11.2 12.0 13.0 17.3 18.4 19.5 
700 11.7 12.5 13.3 16.9 17.9 18.8 
1M 12.2 12.9 13.6 16.6 17.4 18.1 x 
1.5M 12.7 13.2 13.8 16.3 16.9 17.5 } 
2M 13.0 13.5 14.0 16.1 16.6 17.2 ey 
3M 13.4 13.7 14.2 15.9 16.3 16.8 . 
5M 13.7 14.0 14.4 15.7 16.0 16.3 
10M 14.1 14.3 14.5 15.5 15.7 15.9 
20M 14.5 14.7 15.3 15.5 15.7 | 
50M 14.6 14.7 14.8 15.2 15.3 15.4 ae 
100 M 14.7 14.8 14.9 15.1 15.2 15.3 + 
Percentage of A’s in sample = 10
130 5. 14.2 16.5 18.7 - 
150 5. 13.9 16.0 18.0 
170 6. 13.6 15.5 17.4 
200 6. 12.3 15.0 16.7 
250 on 12.9 14.4 15.9 
300 6. 12.6 14.0 15.3 _ 
350 7. 12.4 13.6 14.8 
400 7. 12.2 13.4 14.5 
500 7. 11.9 13.0 13.9 
700 7. 11.6 12.5 13.3 | 
8. 11.3 12.0 12.7 : 
1 8. 111 11.6 12.2 
8. 10.9 11.4 11.9 
9. 10.7 11.1 11.5 
on. 10.6 10.9 11.1 
9. 10.4 10.6 10.8 
9. 10.3 10.4 10.6 
9. 10.2 10.3 10.4 
9. 10.1 10.2 10.2 


TABLE II—Continued
CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 7.5
200 3.5 4.3 5.2 10.5 12.1 13.6
250 3.9 4.6 5.4 10.1 11.5 12.9
300 4.1 4.8 5.6 9.8 11.1 12.3
350 4.3 5.0 5.7 9.6 10.8 11.9
400 4.5 5.1 5.9 9.5 10.5 11.6
500 4.8 5.4 6.0 9.2 10.2 11.1
700 5.2 5.7 6.2 8.9 9.7 10.4
1M 5.5 6.0 6.4 8.7 9.3 9.9
1.5M 5.9 6.2 6.6 8.5 9.0 9.4
2M 6.1 6.4 6.8 8.3 8.7 9.1
3M 6.3 6.6 6.9 8.2 8.5 8.8
5M 6.6 6.8 7.0 8.0 8.3 8.5
10M 6.8 7.0 7.2 7.8 8.0 8.2
20 M 7.0 7.1 7.3 7.7 7.9 8.0
50M 7.2 7.3 7.3 7.7 7.7 7.8
100 M 7.3 7.3 7.4 7.6 7.7 7.7
Percentage of A’s in sample = 5
240 2.1 2.6 3.3 7.3 8.6 9.8
260 2.2 2.7 3.4 7.2 8.4 9.6
300 2.3 2.8 3.5 7.0 8.1 9.2
350 2.5 3.0 3.6 6.8 7.8 8.8
400 2.6 3.1 3.7 6.7 7.6 8.5
450 2.7 3.2 3.7 6.6 7.4 8.3
500 2.8 3.3 3.8 6.5 7.3 8.1
600 3.0 3.4 3.9 6.3 7.1 7.8
700 3.1 3.5 4.0 6.2 6.9 7.5
800 3.2 3.6 4.0 6.1 6.8 7.3
1M 3.4 3.7 4.1 6.0 6.5 7.1
1.5M 3.7 4.0 4.3 5.8 6.2 6.6
2M 3.8 4.1 4.4 5.7 6.1 6.4
3M 4.0 4.2 4.5 5.6 5.8 6.1
5M 4.2 4.4 4.6 5.4 5.6 5.9
10M 4.5 4.6 4.7 5.3 5.4 5.6
20M 4.6 4.7 4.8 5.2 5.3 5.4
50 M 4.8 4.8 4.9 5.1 5.2 5.3
100 M 4.8 4.9 4.9 5.1 5.1 5.2

Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.


TABLE II—Continued

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Continued

Note.—For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals in the sample. P = probability.

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 3
Percentage of A’s in sample = 2
Percentage of A’s in sample = 1
2.0 4.4 5.2 5.9 
21 4.2 4.9 5.6 
21 4.1 4.7 5.3 
2.2 4.0 4.6 5.1 
3.9 4.4 4.9 
2.3 3.8 4.3 4.7 
2.5 3.6 4.0 4.3 
2.5 3.6 3.8 
2.6 3.4 3.7 3.9 | 
2.7 3.3 3.5 3.7 
2.8 3.2 3.4 3.5 | 
2.8 3.2 3.2 3.3 
2.9 3.1 3.2 3.2 | 
2.9 3.1 3.1 3.1 
1.3 3.0 3.5 4.0 : 
1.4 2.9 3.3 3.8 ; 
14 3.2 3.7 
15 2.7 3.1 3.4 
1.6 2.8 3.1 
1.6 2.5 2.7 
1.7 2.4 2.6 
1.8 
1.8 2.2 
1.9 21 2.2 
1.9 2.1 2] 
1.9 21 2. 
1M 37 48 62 1.5 1.8 2.1 \ 
1.5M ‘46 56 69 1.4 1.6 1.9 
2M "52 61 "73 14 es 1.7 
3M 39 68 78 13 14 1.6 
5M 67 "14 "92 1.2 1.3 14 
7M 72 78 85 1.2 1.3 1.4 
10M 16 ‘81 87 11 1.2 1.3 
20 M "83 "87 ‘O1 11 11 1.2 
50M "89 94 11 11 11 
70 M ‘91 "93 95 11 11 11 
100 M 92 94 6 1.0 1.1 1.1
TABLE II—Concluded

CONFIDENCE LIMITS FOR TWOFOLD CLASSIFICATION OF ENUMERATION DATA—
NUMBER OF A’s IN SAMPLE: 20 AND OVER—Concluded

Lower limits (%) Upper limits (%)
.005 .025 .10 .10 .025 .005
Percentage of A’s in sample = 0.7


2.5M 
3M 


SS5 Sram 


Note.— For mode of use see Examples 8 to 12. M = 1000. N = total number of individuals 
in the sample. P = probability. 


31 38 AT 1.0 1.2 1.3 
34 ‘41 ‘50 ‘97 1.1 1.3 
43 ‘51 (94 1.1 1.2 
‘41 ‘47 54 ‘90 1.0 1.1 
43 49 ‘55 88 ‘97 1.1 
47 .85 .93 1.0 
50 ‘55 ‘60 82 88 
56 59 63 ‘78 83 
‘61 63 ‘65 ‘75 ‘78 
100 M 63 65 ‘67 174 ‘75 
Percentage of A’s in sample = 0.5
23 28 34 1 194 
4M ‘31 36 ‘68 ‘77 
5M 28 32 65 ‘74 82 
6M ‘34 39 64 71 ‘79 
7M 31 .35 .40 .63 .70 
10M ‘34 ‘41 ‘60 66 ‘1 
20M ‘38 ‘41 44 ‘57 61 64 
50M 42 44 ‘54 57 ‘59 
100 M 44 ‘46 ‘47 53 ‘55 56 
Percentage of A's in sample = 0.3 
.16 .20 .44 .52 .60 
‘14 17 ‘21 43 56 
‘15 ‘18 21 ‘41 ‘47 53 
16 ‘19 22 ‘40 46 ‘51 
18 ‘20 38 ‘43 ‘47 
21 25 36 .39 42 
26 ‘34 37 ‘39 
25 27 33 35 
70 M 25 26 27 ‘33 34 
100 M 127 28 32 34 
Percentage of A’s in sample = 0.1
6M 026 033 053 18 22 26 
7M ‘029 "040 (056 ‘17 ‘21 24 
10M 048 ‘15 18 “21 
20 M 061 14 ‘15 ‘17 
30M 068 ‘077 13 ‘14 16 
50M .067 .074 12 13 14 
70M ‘12 ‘13 14 
100 M 081 ‘il ‘12 ‘13 
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TABLE III 


CORRECTION TERMS FOR ESTIMATION OF CONFIDENCE LIMITS—NUMBER OF A’s IN SAMPLE 
GREATER THAN 20; PERCENTAGE OF A's: 10 OR LESS 


Lower limits Upper limits 
.025 


9; 
9. 
9 
8 
8 
8 
8 
8 
7. 
6. 
6. 
6. 
6 
6 
5 
5 
5 
5. 
5. 
4 
4 
4. 
4. 
4. 
3. 
3. 
3. 
3. 
3 
2 
2 
2 
2 
2 
1 
1 
-1 
1 
1 
0 
0 
0 
0 


6 
4 
2 
6 
‘4 
2 
0 
8 
6 
4 
2 
0 
8 
6 
4 
2 
0 
‘8 
6 
4 
2 
0 
6 
4 
2 
0 
8 
6 
4 
2 
0 
‘8 
6 
4 
2 
0 
‘8 
6 
4 
2 
0 
‘8 
.6 
4 
2 
0-0 


Note.—Produced by interpolation in Table VIII1 of Fisher and Yates’s Statistical Tables, by permission of the publishers, Messrs. Oliver and Boyd. For mode of use see Example 12. P = probability.


A's in 
sample, % | p 005 005 
10.0 3 14 1.13 73 2.47 
9.8 4 "80 14 1.13 2.48 
4 ‘81 14 1.14 2.50 
5 ‘81 45 1.14 16 2°51 
6 "82 "45 1.15 6 2.53 
6 AS 1.15 7 2.54 a 
7 * 1.15 8 2.56 
8 16 1.16 9 2.57 
8 16 1.16 B0 2.59 
9 16 1.17 Bi 2.60 
1 
2 
3 
5 
5 
6 
7 
7 a 
8 
9 
9 
70 
71 
72 
73 
73 
74 
75 
75 
76 
77 3 
17 
78 
79 q 
80 
80 
81 
82 
82 
83 
84 
84 
85 
86 "98 26 1.32 13 3.16 
86 ‘99 26 1.33 14 3.18 4 
87 .99 26 1-33 15 3-19
TABLE IV

PROBABILITIES FOR FOURFOLD CONTINGENCY TABLES—
EQUAL SAMPLES UP TO N = 20

Sample No. P Sample No. P Sample No. P
(1) (2) (1) (2) (1) (2)
728 .0003 0:9 4:5 .0412 
6:1 .0023 3:6 . 1030 
.0105 .2353 
.0350 .5000 
3:4 0962 8:1 .0017 
2:§ .2308 ata 
736 .5000 .0249 
= 0146 5:4 .0656 
0:3 0500 -0513 435 .1471 
2.2% .2000 4:3 .1329 2882 
.5000 .2797 .5000 
.5000 .5000 1:8 .7647 
.8000 £236 . 7692 227 .0283 
N=4 .1431 6:3 .0767 
0:4 4:0 0143 4:3 .2960 ote .1674 
0714 3:4 .5000 .3100 
.2143 .7203 336 .5000 
.5000 4:3 .5000 aie .7118 
.5000 .3186 
.7857 |IN =8 .5000 
Ris 7571 0:8 8:0 .0001 3:6 .6900 
| .0007 5:4 .5000 
N = 5 .0035 4:3 .6814 
.0238 4:4 .0385 = 10 
ota .0833 . 1000 0:10 | 10:0 .0000 
.2222 236 . 2333 
.5000 .5000 .0004 
1:4 | .1032 27 .0051 .0015 
.2619 6<2 .0203 6:4 -0054 
.5000 5i3 .0594 .0163 
1:4 .7778 4:4 .1410 4:6 .0433 
233 .5000 .2846 -1053 
.7381 .5000 228 .2369 
Be . 7667 1:9 .5000 
N =6 2:6 .0660 .0005 
0:6 6:0 .1573 .0027 
5:1 |. .0076 4:4 .3042 Ee .0099 
432 .0303 .5000 6:4 .0286 
.0909 236 .7154 .0704 
2:4 .2273 $35 5.23 .3096 4:6 .1517 
.5000 4:4 .5000 .2910 
135 | .0400 2:8 .5000 
$32 .1212 4:4 4:4 .6904 7632 
a3 .2727 2:8 
.5000 |IN =9 
EER 7727 0:9 9:0 .0000 6:4 
4:2 2836 .0002 
.5000 722 .0011 4:6 
.7273 6:3 .0045 as? 
ats 7165 S24 .0147 


Note.—For mode of use see Example 14. N = total number of individuals in each sample. 
P = probability. 
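
The legible entries of Table IV are consistent with a one-tailed exact probability: the chance, with all marginal totals fixed, of the observed fourfold table or one more extreme in the same direction. The sketch below reproduces a few entries on that assumption. The function name and argument order are illustrative, and the reading of the table layout (sample compositions written as A's : not-A's) is inferred from the entries rather than stated here.

```python
from fractions import Fraction
from math import comb

def one_tailed_p(a1, b1, a2, b2):
    """Exact one-tailed probability for the fourfold table
    sample 1 = a1 : b1, sample 2 = a2 : b2 (A's : not-A's),
    all marginal totals held fixed."""
    n1, n2 = a1 + b1, a2 + b2
    total_a = a1 + a2
    denom = comb(n1 + n2, total_a)
    if a1 * n2 <= a2 * n1:
        # sample 1 has the lower (or equal) proportion: sum over fewer A's in sample 1
        ks = range(max(0, total_a - n2), a1 + 1)
    else:
        # sample 1 has the higher proportion: sum over more A's in sample 1
        ks = range(a1, min(n1, total_a) + 1)
    num = sum(comb(n1, k) * comb(n2, total_a - k) for k in ks)
    return Fraction(num, denom)

# N = 4, samples 0:4 and 4:0 (tabulated .0143)
print(round(float(one_tailed_p(0, 4, 4, 0)), 4))
# N = 6, samples 1:5 and 5:1 (tabulated .0400)
print(round(float(one_tailed_p(1, 5, 5, 1)), 4))
```

Exact fractions are used so that no rounding enters before the final conversion to four decimals.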
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TABLE IV—Continued 


PROBABILITIES FOR FOURFOLD CONTINGENCY TABLES— 
EQUAL SAMPLES UP TO N = 20—Continued 


Sample No. Sample No. 
(2) (1) (2) 


00 CON 


— 


COP REN 


0 
1 


— 


- 


0 
2 
3 
4 
5 
6 
8 
: 10 
37 
:8 
:9 
1 
1 
2 
3 
2:4 
: 6 
:8 
: 10 
33 
:6 
:7 
:8 
9 
4 
5" 
6 
78 
5 
36 
37 
:6 


DUD TOW NW 
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Sample No. 
P P 
(1) (2) 
N = 10 N = 12 WARE 
:4 .0002 
.0008 
:6 .0026 
.0075 
.0196 
4:6 :9 .0478 
: 10 .1100 
31% . 2400 4 
.5000 
.0000 
N = ll .0001 
0:11 .0005 
.0018 
.0056 
: .0047 0151 
: -0136 
.0062 .0343 
.0175 .0775 1609 
:7 | .0451 .1584 
:8 | .1072 :2950 -2965 
.2381 .5000 .5000 
: 10 .5000 . 7609 -7600 
£2301 181i .0002 .0017 22h .0006 
.0010 -0061 .0024 
.0037 .0180 .0077 
.0119 .0447 .0207 
.0317 .0965 .0484 
:6 .0743 .1854 . 1008 
.1554 .3202 .1891 
.2932 .5000 .3224 
:9 .5000 .7050 .5000 : 
: 10 .7619 3:9 .0196 .7035 
2:9 .0045 .0498 3:10 .0085 
-0150 . 1069 .0236 
.0402 . 2002 .0554 
:6 . 1807 .5000 .2055 
.3176 .6798 .3364 
:8 .5000 4:8 .1102 .5000 
:9 . 7068 . 2068 .6776 i 
3:8 -0431 .3401 4:9 .0576 
74 .0992 .5000 .1189 
.1935 -6666 .2142 
$6 .3297 .3421 .3441 
.5000 .5000 .5000 
.6824 .6599 .6636 
4:7 :4 .1974 6:6 .6579 $:8 .2169 
.3350 .3475 
:6 .5000 |IN = 13 .5000 
.6703 .0000 .6559 
5:6 .5000 .0000 6:7 .5000 
.6650 10:3 .0001 .6524 
TABLE IV—Continued


VOL. 26, SEC. E. 


PROBABILITIES FOR FOURFOLD CONTINGENCY TABLES— 


EQUAL SAMPLES UP TO N = 20—Continued 


Sample No. 


(1) (2) 


NID 0 


NHN WHE UD =1000 


NW 


9: 

8: 

Sz 

a¢ 

4: 

.5000 

-6482 

-6468 10: 

9: 

8: 

70 .0000 7% 
2:4 .0000 6 
2:5 .0001 5 
.0003 4 
7 .0011 5:10] 10 
8 -0032 9 
9 .0084 8 
10 .0211 7 
1 .0498 6 
12 .1121 5 
13 .2414 6:9 9 
14 .5000 8 
.0000 7 
.0000 6 
.0001 7:8 


CONDO CONTA 


OONIAUO 


on 

ard 


o 


Sample No. Sample No. | 
a (1) (2) (1) (2) 
N = 14 N = 14 N = 15 
Ee 0:14] 14:0 .0000 5:9 6 .0302 
Clee 11:3 .0000 7 .0641 
10:4 .0001 .1225 
449 9:5 | .0003 9 | .2135 
8:6 | .0010 10 | .3408 
737 .0029 6:8 11} .5000 
6:8 .0080 12 | .6743 
5:9 .0204 4 .0134 
i 4:10] .0489 737 .0328 
3:11} .1111 .0697 
2:12] .2407 |IN = 15 .1318 
1:13] .5000 0:15 
1:13 | 13:1 .0000 .3499 
as 12:2 .0000 10 .5000 
11:3 .0002 9 6592 
10:4 .0007 .0716 
.0022 7 .1362 
.0064 6 .2311 
.0164 5 .3552 
78 .0384 4 .5000 
79 .0824 3 
| .1630 2 -2331 
.2978 1 .3576 
:12] .5000 1:14] 14 .5000 
213 | .7593 13 
2:12 .0002 12 .5000 
.0009 11 6424 
74 .0032 10: 
-0092 9: N = 16 
76 .0230 8: : .0000 
.0516 : -0000 
78 . 1043 6: : .0001 
79 .1923 : .0004 
:10| .3242 . : .0012 
3:11 .0035 1:14] . | .0217 
: .0107 2:13 | 13:2 . :12 | .0506 
- 0271 12,: 3 | .1129 
: .0601 11:4 .2419 
: .1182 10:5 :15 | .5000 
: .2099 9:6 ‘ .0000 
: .3388 8:7 33 .0000 
: .5000 74 -0001 
: .6758 6:9 .0003 
4:10 : .0285 . :6 .0010 
: .0642 4:11] . :7 .0030 
: .1259 :8 .0077 
.2200 2:13]. :9 | 0186 
: .3473 3:12] 12:3 :10 | .0415 
: .5000 11:4 :11 | .0860 
6612 10:5 212] .1663
TABLE IV—Continued


PROBABILITIES FOR FOURFOLD CONTINGENCY TABLES— 
EQUAL SAMPLES UP TO N = 20—Continued 


Sample No. Sample No. Sample No. 
(1) (2) (1) (2) (1) (2)
N = 16 N = 17 N = 17 
3:13 . 2998 0:17 117 -0000 4:13 | 13:4 -0026 
2:14 .5000 11:6 -0000 .0075 
4:15 .7581 10:7 -0001 11:6 .0183 
2:14] 14:2 .0000 9:8 .0005 10:7 .0399 
13:3 .0001 8:9 -0013 9:8 .0785 
12:4 .0005 7:10 .0036 8:9 .1409 
33:33 .0016 6:11 .0092 7:10 .2323 
10:6 .0046 -0222 6:11 -3540 
9:7 .0117 4:13 -0513 $333 .5000 
8:8 .0269 3:14 .1136 4:13 
7:9 .0567 .2424 43:5 -0190 
6:10 .1100 1: 16 .5000 11:6 .0422 
5's .1972 1:16] 16:1 .0000 10:7 -0832 
4:12 .3270 13:4 .0000 78 . 1480 
3:13 .5000 12:5 .0001 . 2406 
2:14 7002 13:6 .0004 : 10 .3603 
3:13 | 13:3 .0005 10:7 -0012 11 .5000 
12:4 .0019 9:8 .0033 .6460 
11:5 .0057 8:9 .0083 6:11 : 6 .0847 
10:6 .0145 -0195 s@ .1514 
.0329 6:11 .0427 78 2453 ; 
8:8 .0675 .0874 .3641 
7:9 . 1262 4:13 1676 : 10 .5000 
6:10 .2166 3:14 .3006 .6397 
.3425 2:85 .5000 7:10 . 2468 
4:12 .5000 1:16 .7576 .3659 
3:13 .6730 1332 .0000 .5000 
1 12:4 .0061 14:3 .0000 : 10 .6358 
11:5 .0160 13:4 .0002 8:9 78 .5000 
10:6 .0366 2:5 .0006 -6341 
.0744 11:6 .0019 
. 1367 10:7 -0052 = 18 
.2289 -0128 0:18 | 18:0 .0000 
.3521 8:9 .0285 12:6 -0000 
.5000 7:10 .0588 -0001 
.6574 -1123 10:8 -0002 
.0378 $312 .1992 9:9 .0005 
1 .0778 4:13 .3281 8:10 .0014 
. 1426 3:14 .5000 7:11 .0038 
.2363 .6994 6:12 -0095 
.3580 3:14] 14:3 -0002 .0227 
.5000 13:4 .0008 4:14 .0519 
.6479 12:5 -0024 3:15 .1143 
6:10} 1 . 1445 11:6 .0067 2:16 .2429 
.2397 10:7 .0162 .5000 
.3612 9:8 .0354 -0000 
.5000 8:9 .0705 13 .0000 
.6420 7:10 .1294 12:6 -0001 i 
739 .3622 6:11 .2192 2.37 -0005 
. 5000 .3440 10:8 .0014 
-6388 4:13 .5000 9:9 .0036 
8:8 -6378 3:14 .6719 8:10 -0089 
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TABLE IV—Continued 


PROBABILITIES FOR FOURFOLD CONTINGENCY TABLES— 
EQUAL SAMPLES UP TO N = 20—Continued 


Sample No. Sample No. 
(1) (2) 


13 
12 
11 
10 
9 
8 
7 
6 
5 
4 
3 
2 
16 
15 
14 
13 
12 
11 
10 
9 
8 
7 
6 
5 
4 
as 
14: 
13 
12 
11 
10 
9 
8 
7 
6 
5 
4 
14 
13 
12 
11 
10 
9 
8 
7 
6 
5 
13 
12 
11 
10 


Sample No. 
P P 
= (1) (2) 
N = 18 N = 18 N = 19 
ee 1:17] 7:11] .0204 5:13] 7:11] .3623 2:17 6 .0003 
‘im 6:12] .0438 6:12] .5000 7 .0009 
5:13 | .0887 5:13 | .6444 8 .0026 
a 4:14| .1688 6:12] 12:6 .0472 9 .0064 
3:15} .3013 11:7 .0906 10 | .0146¢ 
2:16] .5000 10:8 .1573 11} .0312 
1:17 | .7571 9:9 .2499 12 | .0622 
2:16 | 16:2 .0000 8:10] .3666 13 | .1160 
15:3 .0000 7:11] .5000 14} .2024 
14:4 .0001 6:12] .6377 15 | .3299 
13:5 .0002 7:11] 11:7 .1588 16 | .5000 
12:6 .0008 10:8 .2526 17 | .6981 
a 11:7 .0023 9:9 .3690 3:16 3 .0000 
a 10:8 .0058 8:10] .5000 4 .0001 
9:9 .0137 7:11 | .6334 5 .0004 
8:10] .0300 8:10 | 10:8 6 .0013 
7:11 | .0606 9:9 .5000 7 .0035 
6:12] .1142 8:10] .6310 8 .0085 
5:13 | .2009 9:9 9:9 .6302 9 
4:14] .3291 10 | .0394 
3:15] .5000 |IN = 19 11} .0755 
2:16] .6987 0:19 | 19:0 .0000 12} .1345 
3:15] 15:3 12:7 .0000 13| .2235 
14:4 .0003 11:8 .0001 14| .3464 
13:5 .0010 10:9 .0002 15 | .5000 
12 :6 .0030 9:10} .0006 16 | .6701 
11:7 .0076 8:11} .0015 4:15 4 .0005 
10:8 .0177 7:12] .0040 5 
9:9 .0375 6:13 | .0098 6 .0041 
8:10} .0732 5:14] .0232 7 .0101 
7:11 | .1321 4:15 | .0525 8 .0224 
6:12| .2215 3:16] .1149 9 .0455 
5:13 | .3453 2:17] .2432 10 | .0852 
4:14] .5000 1:18 | .5000 11 | .1476 
3:15 | .6709 1:18] 18:1 .0000 12} .2378 
4:14] 14:4 0011 14:5 .0000 13 | .3570 
13:5 13 : 6 .0001 14} .5000 
12:6 .0088 .0002 15 | .6536 
11:7 «0205 11:8 .0005 5:14 5 .0043 
10:8 10:9 .0015 6 .0109 
9:9 .0821 9:10| .0039 7 .0244 
8:10] .1445 8:11] .0094 8 .0496 
7:11] .2353 7:12] .0211 .0918 
6:12) .3556 6:13] .0448 | .1566 
5:13 | .5000 5:14] .0899 | .2475 
4:14] .6547 4:15] .1699 | .3640 
5:13 | 13:5 .0092 3:16| .3019 .5000 
12:6 .0219 2:17 | .5000 14 | .6430 
11:7 .0461 1:18] .7568 6:13 6 .0251 
10:8 .0878 2:171|17:2 .0000 .0516 
9:9 .1526 15:4 .0000 8 .0957 
8:10] .2443 14:5 .0001 :9 .1623 
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Sample No Sample No. Sample No 
(1) (2) (1) (2) (1) (2)
N =19 N = 20 
6:13 | 9:10] .2538 1:19] 4:16] .1708 4:16] 8:12| .1504 
8:11] .3687 3:17] .3025 7:13 | .2401 
7:12] .5000 2:18} .5000 6:14] .3582 
6:13 | .6359 1:19] .7564 5:15 | .5000 
7:12 | 12:7 :0969 2:18 | 18:2 -0000 4:16| .6526 
11:8 ‘1650 15:5 :0000 5:15] 15:5 :0019 
10:9 14:6 | .0001 14:6 
9:10) .3716 13:7 -0004 13:7 (0124 
8:11 | .5000 12:8 12:8 
7:12 | .6313 11:9 ‘0029 11:9 (0527 
8:11] 11:8 2586 10:10] .0069 10:10] .0954 
10:9 :3729 9:11] .0155 9:11] .1601 
9:10} .5000 8:12} .0324 8:12] .2503 
8:11] .6284 7:13 | .0637 7:13 | .3655 
9:10] 10:9 -5000 6:14] .1176 6:14] .5000 
9:10] .6271 5:15 | .2037 5:15 | .6418 
4:16] .3307 6:14] 14:6 | .0128 
N = 20 3:17] .5000 13:7 
0:20] 20:0 .0000 2:18] .6975 12:8 
12:8 :0000 3:17|17:3 :0000 11:9 
11:9 :0001 16:4 “0000 10:10] .1666 
10:10] .0002 15:5 9:11] .2573 
9:11] .0006 14:6 | .0005 8:12] .3705 
8:12] .0016 13:7 7:13 | .5000 
7:13 | .0042 12:8 6:14| .6344 
6:14| .0101 11:9 7:13 | 13:7 0564 
5:15 | .0236 10:10| .0204 12:8 
4:16] .0530 9:11] .0412 11:9 ‘1703 
3:17 | .1154 8:12 | .0776 10:10] .2615 
2:18| .2436 7:13 | .1367 9:11] .3738 
1:19] .5000 6:14] .2253 8:12] .5000 
.0000 .3474 7:13 .6295 
14:6 -0000 4:16| .5000 8:12] 12:8 1715 
13:7 3:17] .6693 11:9 2636 
12:8 -0002 4:16] 16:4 10:10| .3756 
11:9 -0006 15:5 9:11] .5000 
10:10} .0017 14:6 | .0018 8:12] .6262 
9:11] .0042 13:7 0048 9:11] 11:9 
8:12] .0098 12:8 (0112 10:10} .5000 
:13 | .0218 11:9 9:11 | .6244 
6:14] .0457 10:10] .0479 || 10:10] 10:10| .6238 
5:15} .0909 9:11] .0880 


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TABLE V 


SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19


Smaller sample (N2)— 
minimum differences 


Smaller sample (N2)— 
minimum differences 


Neo 


:8 
2:6 


Pw 


2 (.0210) 
: 0 (.0163) 


: 0 (.0101) 


: 0 (.0222) 


0 (.0061) 


0 (.0242) 


182) 


se 00 


: 10 
:9 
:8 


N.= 3 
3 : 0 (.0035) 


N2= 4 
4:0 (.0010) 
4:0(.0050—) 


Larger sample (N1) Highly significant (P < .005) Significant (P < .025)
Ni= 5 N.= 3 Ni= 8 7 
:5 3:0 (.0179) 0:8 6:1 5:2 
0:5 4:0(.0079) || 3:5 7200187 
0:6 3:0(.0119) | 0:9 2 : 0 (.0182) 
N.= 4 a= 3 
0:6 4:0 (.0048 -- 0:9 3 : 0 (.0045) we 
1:5 (.0238) || 1:8 3:0 (.0182) 
= =4 
0:6 (0022) 4:1 0:9 4:0 (.0014) | 3:1 (.0140 
1:5 5 : 0 (.0130 
Ni= 7 N:= 3 
4 1:8 $20 (0030) 4:1 (.0230) 
0:7 4:0 (.0030) | 3:1 2:7 — 5 : 0 (.0105) 
1:6 — 4: 0 (.0152) 
2= 
2= 5 0:9 : 4 : 
0:7 5 : 0 (.0013) $:3(-oe 1:8 5:1. 
236 5 : 0 (.0076) 


: 0 (.0140) 


: 1 (.0110) 
: 0 (.0150) 


Note.—For mode of use see Examples 15 to 18. P = probability (in parentheses). 



= tee 1:6 6 : 0 (.0040 0: 5 : 2 (.0048) : 3 (.0192) 
N= 8 2 3: 60087) | Cotos) 
0:8 2 
3 0 72 (.0023 3 (.0090) 
1:7 8 : 0 (.0019) : 1 (.0123) 
3: : 0 (.0067) 
Ni= 4 4: : 0 (.0204) 
0:8 4:0 (.0020) | | 
0: 0 (.0152) 
N.= 5 

| : 0 (.004 : 

6 

5:1 4: 2 (.0150) 0 3 

6 : 0 (.0023) i 
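
The legible entries of Table V can be reproduced by searching, for a given larger sample, for the least extreme composition of the smaller sample whose exact one-tailed probability falls below the chosen level. The sketch below does this under two assumptions that are inferred from the entries, not stated in this section: that compositions are written as A's : not-A's, and that the smaller sample is the one with the higher proportion of A's. The function names are hypothetical.

```python
from math import comb

def tail_p(a1, n1, a2, n2):
    """Probability that the larger sample holds a1 or fewer A's,
    given n1, n2 and the total number of A's (margins fixed)."""
    total_a = a1 + a2
    denom = comb(n1 + n2, total_a)
    ks = range(max(0, total_a - n2), a1 + 1)
    return sum(comb(n1, k) * comb(n2, total_a - k) for k in ks) / denom

def minimum_difference(a1, n1, n2, level):
    """Least extreme smaller-sample composition (a2, b2, P) with P < level,
    or None if even the most extreme composition is not significant."""
    best = None
    for a2 in range(n2, -1, -1):        # start from the most extreme table
        if a2 * n1 <= a1 * n2:          # smaller sample no longer has the higher proportion
            break
        p = tail_p(a1, n1, a2, n2)
        if p >= level:
            break
        best = (a2, n2 - a2, p)
    return best

# Larger sample 0:9 (N1 = 9), smaller sample of N2 = 4
print(minimum_difference(0, 9, 4, 0.005))   # highly significant
print(minimum_difference(0, 9, 4, 0.025))   # significant
```

For the larger sample 0:9 with N2 = 4 the table gives 4:0 (.0014) and 3:1 (.0140), and the sketch returns the same compositions.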
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TABLE V—Continued 


SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued

Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)


OUP O 


3: 1 (.0088) 
: 0 (.0110) 


5:0 


: 0 (.0110) 


(.0179) 


0128) 


(.0128) 


UP 


© 


CONT OA 


CORR 


: 1 (.0071) 
:0 
.0192) 


sampie sampie Hi hl ; 
(Ni) | (Ni) 
gnificant 
N.= 10 N.= 5 Ni= 11 N:= 6 
0:10 4: 1 3:2 9219) 0:11 5:1 (.0010 (0063 
_ ‘ $ .01 
3:7 5 : 0 (.0187) 3:8 (-0067 
oy 
6 
:1 :1(. : 2 (.0082 N:= 7 
:i1 $:2(.0025) (0114 
6:0(. ) 1 (. ) : 
2:8 6 : 0 (.0035) iy : 10 6: 1 (.0024 (.0129 i 
Nee? 7 (0104 
5 :2(.0034) | (.0147) - (.0249 
ES | 00193 : 2 (.0176) 8 
7:0 (.0019 : 1 ¢.0134) 
0 (.0062) :10 7:1 (0012 : (0237 
: 0 (.0170) 9 7:1 (.0049 : (0216 
N= 8 +: 8: 0 (.0022 : 
8 : 0 (.0010) 9 
8 : 0 (.0038) 6 : 3 (.0022 :4 
: 0 (.0113 : 10 7:2 ¢.0032 : 3 (.0124 
4 ~ .01 = 
:6 = : 0 (0077) 6 : 4 (.0039) 5 (0124) 
: 0 (0217) : 10 8:2 (.0017 : 4 (0209) 
N:=11 2 10:0(. $2¢. 
: = 27 10 : 0 (.0028 1 (.0170) 
0:11 2 : 0 (.0128) 6 AS (0085) 
: 70 (. Ni=12 |N,=2 
1:10 1 2 : 0 (.0110) 
N.= 4 N= 3 
0:11, 4 : 0 (.0007) 0:12 3 : 0 (.0022) — : 
5 N.= 4 
0:11 4:1(.0027) | 3 4: 0 (0008) 3 
1:10 5:0 4 4:0 (.0027 
2:9 5 : 0 (.0048) : 10 = 4 } 
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SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
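UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued

Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)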


10 : 1 (.0022) 
11 : 0 (.0010) 
11 : 0 (.0032) 




2 : 0 (.0095) 


10 : 0 (.0124) 


4 
| Highly ‘ 
Larger Highly (P < .005) 
1 
(Ni -005) N= 11 8 : 3 (.0101) 
9 : 2 (.0028) 0:1 (0070 
=5 : 2 (.0147) :9 10:1 (0188 
1:11 3:0 (0034) 5 : 0 (.0090) 
3:10 5 : 0 (.0204) 6:6 
3: 
4:8 
(0129) 0 (.0018) 3:0 (0071) 
1:11 20.0015) | 5:1 0: 13 Corrs) 
6: 3:6 
Se 2:10 6 : 0 (.0045) 6 : 0 (.0113) 1:12 aes 
3:9 6 : 0 (.0249) 59) 
5:7 £13 {0021} 0063) 
4:3 (.0090) 0: 12 4:0 (.00 
2:10 7:0 7:0 .0065) N2= 5 0016) | 3:2 
: > 0 (0147 
“373 (0036) | 4:4 || 3:40 
73 (. 73 (. : 
4:01 7:1 (.0032) 7:1 (.0099) 0039) 0173) 
Ws 2:10 8 : 0 (.0013) 7:1 (.0249) 0:13 4: 1 (.0029) | 4 : 1 (.0095) 
=. 3:9 8 : 0 (0039) 8 : 0 (0101) 2:11 | | : 1 (0237) 
4:8 0 (.0238) 2:11 0031) | 5: 0077) 
N.= 4:5(. 5:8 
sist 6 : 3 (.0015) 6 : 3 (.0090) : 3 (.0072) 
0:12 7 : 2 (0088) 
1: :2¢. :13 1 23 
3:9 9 : 0 (0024) 9 : 0 (0068) 3:10 | 72000015) | 6: 
4:8 9 : 0 (.0170) 3:10 : 0 (,0043) : 0 (.0101 
5:7 4:9 720 (-0221 
6:6 (.0096) 5:8 7: 
=1 75 (. 77 
: 4 (.0028) : 4 (.0155) 8 : 4 (.0117) 
| 7:3 (.0170) (.0028) 3:3 (0139) 
A, 1:11 8 : 2 (.0048) 8 : 2 (.0150) 0:13 +. 2 (.0032) | 5 : 2 (0112) 
: 6 8: 7: 
4:8 10 : 0 (.0046) 3 : 10 8 0 (.0024) 8 : 0 (-0063) 
6:6 37) || 5:8 — | 8:0¢ 
oe : 6 (.01 : 
(008) (0239) || 627 
0:12 8 : 3 (.0025) 
1:11 
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TABLE V—Continued 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued
Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N.= 9 N. = 14 
5 : 4 (.0048) 4:5 (.0172) 2:12 — 
7:2 (.0015) 5 : 4 (.0231) 3:11 4:0 (.0114) 
8:1 (.0011) 6 : 3 (.0220) 4:10 4:0 (.0229) 
8 : 1 (.0037) 7:2 (.0170) 
9:0 (.0014) 8:1 (.0104) 
9:0 (.0040) 8:1 (.0247) 0:14 
— 9: 0 (.0101) 1:13 4:1 (.0061) 
_ 0: 9 (.0230) 4:1 (.0173) 
10 4:10 5 : 0 (.0108) 
0: :4(.0021) | 4: 6 (.0237) 5:9 3:0 021% 
: 2 (.0032 
3: : 1.0022) | 8:2 (.0101) 0:14 3:3 
‘i 4:2 (.0140) 
4: :0(.0009) | 9:1 (.0066) : : 
5: : 0 (.0026) | 9:1 (.0167) 2:12 5:1 (.0072 
6: aes 10 : 0 (.0070) 3 : 11 5 : 1 (.0180 
6: 0:10 (.0170)| 4: 30 
No= 11 6:8 6 : 0 (.0238) 
0: 5:5 (.0038) | :§ (.0109) 
: 3 (.0016 6: .0184 
2: : 2 (.0018) | 7:4 (.0213) | 9: 14 
: 0 (.0018) | 10 : 1 (.0113) 
4:10 6:1 (.0209 
6: 11 :0(.0050—) 5:9 7:0 (.0068 
6: 0:11 (.0128) 6:8 7:0 (.0148 
12 
0: :5(.0016) | 5:7(.0149) || | §:3(. 4:4 (.0096 
1: 7:5 (.0100) 1:13 6:2(. 5:3 (.0109 
2: : 3.0041) | 8:4 (.0127) 2:12 7:10. 6 : 2 (.0084 
3: : 2 (.0038) | 9:3 (.0131 3:11 6:2 
4: : 1 (.0027) | 10 : 2 (.0114) 4:10 8:0(. 7:1 (.0119 
: 0 (.0012) 5:9 8:0(. 
: — > 12 (.024 6:8 — 8 : 0 (.0094 
6: 0 (.0036) | 11 : 1 (.0202) 7:7 a 8 : 0 (.0201 
6: 0 : 12 (.0097) 
2=. 
2 / 0:14 §<4¢ 4 
—_ 2 : 0 (.0083) 1:13 623%: 5 : 4 (.0183) 
6: 7: 2 (.0122) 
3 4:10 9:0(. 8 : 1 (.0070) 
2 3 : 0 (.0147) 9 01405 
=4 
3:1 (.0049) 0:14 4: 6 (.0198) 
4:0 (.0016) 3:1 (.0186) 3393 6 : 4 (.0088) 
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SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued

Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N2 = 10 N1 = 15 N2 = 4
8 : 2 (.0022) 0:15 3:1 (.0041 
7:3 (.0245) 1:14 4:0 (.0013 : 1 (.0158) 
9:1(.0041) | 8:2 (.0180) 2:13 4:0 (.0039 — 
10 :0(.0041) | 9:1 (.0245) 4:11 4:0 (.0181 
N:= 2:13 | | 4:1 0140) 
: § (.0026 5:6 0087) 036} 
7:4 (.0045) | 6:5 (.0142 4:11 5:0 (.0081 
:0(. :1(. 0:15 4:2(.0025) | 3:3 (.0150 
11 : 0 (.0028) | 10: 1 (.0172) 1:14 5:1(.0017) | 4:2 rrr 
0:11 (.0170))) 2:13 6:0(.0005) | 5:1 
11 : 0 (.0071) 3:12 6:0 5:1 (.0139 
4:11 6: 0 (.0039 _ 
12 5:10 6:0 
9:3 (.002 N= 7 
0:15 4:3 (.0048) | 3:4 (.0227 
9 : 3 (.0236) 1:14 5:2 (.0043 4:3 (.0207 
11 : 1 (.0047) | 10 : 2 (.0187) 2:13 6:1 (.0023 § :2 (.0135 
12 2000019 11:1(¢.0121 7:0 (.0007 6:1 (.0066 
0 : 12 (.0130) 4:11 0019 6:1 (.0155 
N.= 13 7:8 7:0 .0201 
5:8 N:= 8 
8:5 (.0040) | 7:6(.0114 0:15 5:3(.0017) | 4:4 (.0079 
10 :3(.0016) | 8:5 (.0151) 1:14 6:2(.0017) | 5:3 (.0084 
11 :2(.0015) | 9:4 (.0166) 2:13 7:1(.0010) | 6:2 (.0062 
11 : 2 (.0048 3:12 7:1 (.0033) | 6:2 (.0166 
12 : 1 (.0032) | 11 : 2 (.0130) 4:11 8:0(.0010) | 7:1 (.0084 
0 : 13 (.0248)|) 8:0 (.0026) | 7:1 (.0188 
13 : 0 (.0014) 6:9 — 8 : 0 (.0061 
13 : 0 (.0039) | 12: 1 (.0215) 
N.= 9 
2 0:15 5 :4(.0030) | 4:5 (.0119 
1:14 6:3 (.0037 5:4 (.0146 
— 2:0 (.0221 2:13 7: 2.0030) | 6:3 (.0127 
3:12 8:1 (.0016) | 7:2 (.0089 
N.= 3 4:11 8:1 (.0047 7:2 (.0213 
sie 2:1 (.0196) 5:10 9:0(.0015) | 8:1 (.0113 
3:0 (.0049 6:9 9:0 (.0038) | 8:1 (.0245 
3:0 7:8 9 : 0 (.0088 
_ 3:0 (.0245 7:8 _ 0:9 (.0186 
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sample 
Ni= 14 
2:22 
3:11 
4: 
5:9 
6:8 
6:8 
0:14 
4:10 
$ 
6:8 
0:14 
1:13 
5:9 
6:8 
6:8 
ie 1:13 
2:12 
4:10 
§:9 
6:8 
0:15 
1:14 
0:15 
1:14 
2:13 
3:12 
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TABLE V—Continued 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued
Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 15 N2 = 10 N1 = 15 N2 = 14
0:15 :§ (.0047) 4:6(.0166) | 5:10 | 13:1 0012) 11:3 (.0180 
1:14 7:3(.0017) | 5:5 — 6:9 13 : 1 (.0037) | 12: 2(.0144 
8:2 (.0014) | 6:4 (.0221 6:9 0 : 14 (.0105) 
3:12 8:2 1.oem 7:3 1 a 7:8 14 : 0 (.0015) | 13 : 1 (.0095) 
4:11 9:1 — Hie 7:8 0 : 14 (.0041)| 1: 13 (.0225) 
5:10 10 : 0 (.0009 
6:9 10:0 ¢ 0028) Nis N.= 2 
7:8 0:10 (.0134)|| 1:15 ‘on 2 
N.= 11 N.= 3 
$:§(.0020 | 4:7 | 9:16 3:0 (.0010) | 2:1 (.0176) 
1:14 7:4 (.0033 6:5 (.0110) 1:15 3:0(.0041 otis 
2:13 8:3 (.0035) | 7:4 (.0119) 2:14 3:0 (.0103 
5:10 |10:1 9 : 2 (.0187) 
6:9 11 : 0 (.0016) | 10: 1 (.0110) 0:16 3:1 (.0035 ~ 
6:9 0: 11 (.0217) 
1:15 4:0(.0010) | 3:1 (.0134) 
7:8 11 : 0 (.0041) | 10 : 1 (.0243) 2:14 4:0 (0031 
4:12 4:0 (.0145 
12 
0:15 6 : 6 (.0031) :7 (.0098) 
. . 2= 
| 0: 16 4:1 (.0008) | 3:2 (.0075) 
2:13 9:3 (.0019) 
3:12 | 10:2(.0016) | 8:4 (.0192) 1:15 4:1 (.0039 
4:11 10 : 2 (.0048) | 9:3 (.0166 2:14 5:0 ‘toast 4:1 (.0114) 
5:10 | 11:1 (.0029 10:2 (0139) 3:13 5: 0 (.0028 
6:9 12 :0(.0011) | 11: 1 (.0075 4:12 = 70 ¢. 
6:9 0:12 (.0169)|| 5:11 5:0 
7°8 12 : 0 (.0029) | 11 : 1 (.0176) 6:10 <= 5:0(. 
7:8 — 0 : 12 (.0072) a 
N.= 13 0: 16 4:2(.0021) | 3:3 (.0130 
0:15 6 : 7 (.0046) 1:15 5:1 ‘0013 
1:14 8:5(.0029) | 6:7 (.0232 2:14 5:1 (.0043 — 
2:13 9:4 (.0037) | 8:5(.0111 3:13 6:0(.0011) | 5:1 (.0109 
3:12 10 : 3 (.0037) | 9:4 (.0117 4:12 6:0 (.0028) | 5:1 (.0231 
$:10 | 12:1 C0019) | 11 :2¢.008% || 6:10 61000124 
532 .001 : : . 
6:9 13 : 0 (.0007) | 11 : 2 (.0201 7:9 _ 6 : 0 (.0230) 
6:9 — 13 (.0133) 
1:15 5 :2(.0034) | 4:3 (.0172 
14 2:14 6:1 (.0017 4:3 
0:15 7:7 (0022) | 5:9 (.0169 3:13. | 6:1¢.0049 Ror 
e246 8 : 6 (.0047 7:7(.0127 4:12 7:0(.0013 6:1 (.0116 
2:3 10 : 4 (.0021 8 : 6 (.0173 7:0 (.0032 6:1 (.0239 
3:12 11: 3.0022) | 9:5 (.0197 6:10 7:0 (.0070 
4:11 12 : 2 (.0019) | 10 : 4 (.0198 7:9 7:0 (.0140 




CANADIAN JOURNAL OF RESEARCH. VOL. 26, SEC. E, 


TABLE V—Continued 


SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 


UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued

Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N2 = 8 N1 = 16 N2 = 12
5:3 (.0013) | 4:4 (.0066) 6:10 11 : 1 (.0047) | 10 : 2 (.0192)
6 : 2 (.0013) | 5:3 (.0079) 6:10 — 0 : 12 (.0213)
6 : 2 (.0048) | 5:3 (.0207) 7:9 12 : 0 (.0017) | 11 : 1 (.0111) 
7:1(.0024) | 6:2 (.0127) 7:9 — 0 : 12 (.0097) 
8 :0(.0007) | 7:1 (.0060) 8:8 12 : 0 (.0041) | 11 : 1 (.0241) 
8:0(.0018) | 7:1 (.0136) 
8 : 0 (.0041) 2= 
Be 8 : 0 (.0088) 0: 16 6:7 (.0036) | 5:8 (.0108) 
8: 0 ( 0175) 1:25 (.0022) 6:7 (.0187) 
2:14 9:4 (.0026) | 7 
No= 9 3:13 10 : 3 (.0025) | 8:5 (.0234) 
0: 5:4(.0024) | 4:5 (.0100) 4:12 12: 1(.0003) | 9:4 (.0217) 
6:3 (.0029) | 5:4 (.0119) 5:11 12: 1.0011) | 10 : 3 (.0180) 
2: 7:2 (.0022) | 6:3 (.0098) 6: 10 12 : 1 (.0031) | 11 : 2 (.0131) 
3% 8:1(.0011) | 7:2 (.0065) 6:10 0 : 13 . (0169) 
4: 8: 1 (.0033) | 7:2 (.0158) 7:9 13 : 0 (.0011) | 12 : 1 (.0078) 
$3 9:0(.0010) | 8:1 (.0080) 7:9 — 0 : 13 (.0073) 
6: 9:0 (.0025) | 8:1 (.0172) 8:8 13 : 0 (.0030) | 12 : 1 (.0178) 
0:9 (0238) N:= 14 
— 9 (.0238 
; 0:16 7:7 (.0017) | 5:9 (.0141) 
8: 9:0(¢.0119) 4245 8 : 6 (.0035) | 7:7 (0099) 
3:13 | 11:3C0018) | 928 C0130) 
0: 5:5 (.0038) | 4:6 (.0141 : 3 (.001 -O1 
1: | || 4212 | 11:3 (0046) | 10: 4 (0140) 
2 7:3 (.0048) | 6:4 (.0174) 5:11 12 : 2 (.0035) | 11 : 3 (.0121) 
3: 8 : 2 (.0035) | 7:3 (.0137) 6:10 13 : 1 (.0022) | 12 : 2 (.0091) 
4: 9:1 (.0018) | 8:2 (.0091) 6:10 0 : 14 (.0135) 
5: 9:1 (.0047) | 8 : 2 (.0207) 7:9 14 : 0 (.0008) | 12 : 2 (.0213) 
6: 10 :0(.0015) | 9:1 (.0110) 7:9 oe 0 : 14 (.0056) 
7: 10 : 0 (.0037) | 9:1 (.0230) 8:8 14 : 0 (.0022) | 13 : 1 (.0134) 
7: — 0 : 10 (.0174) No = 15 
8: 10 : 0 (.0082) 0:16 7:8 (.0024) | 5: 10 (.0177) 
0: 6 : 5.0016) | 4:7 (.0188 10 5 (0008 
1: 33 11 : 4 (.0031) 9 : 6 (.0227) 
3: 9 : 2 (.0019) 7:4 (.0244) $241 13 : 2 (.0023) | 11 : 4 (.0228) 
4: 10 :1(.0010) | 8:3 (.0192) 6:10 14 : 1 (.0015) 12 : 3 (.0200) 
5: 10 : 1 (.0029) | 9:2(.0130) || ©: 10 ae 0 : 15 (0109) 
6: 11 :0(.0009) | 10:1(.0071) || 7:9 14 : 1 (0041) | 13 : 2 (0157) 
7: 11 : 0.0024) | 10:1 (.0158) || 7:9 : 15 (0044) 1 : 14 (0234) 
7. ise 0:11 (.0129) 8:8 15 : 0 (.0016) | 14: 1 (.0102) 
11 : 0 (.0058) N.=2 
12 
0: 6 : 6 (.0025) | 4:8(.0242) || 1:16 2 :0 (0176) 
1: 7:5 (.0043) | 6:6 (.0132) N.= 3 
2: 8:4 (.0048) | 7:5 (.0149) 0:17 3:0 (.0009) | 2:1 (.0158) 
3: 9:3 (.0043) | 8:4 (.0141) 1:16 3 : 0 (.0035) — 
4: Hye 9 : 3 (.0117) 2:15 — 3: 0 (.0088 
5: 11 : 1 (.0018) | 10 : 2 (.0083) 3:14 — 3 : 0 (.0175) 




MAINLAND: STATISTICAL METHODS—TABLES
TABLE V—Continued 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued
Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 17 N2 = 4 N1 = 17 N2 = 9
3: 1 (.0030) 7:2 (.0016) | 6:3 (.0077) 
1:16 4:0(.0008) | 3:1 (.0116) 3:14 7:2 (.0048) | 6:3 (.0199) 
4:0 (.0025) 4:13 8 :1(.0023) | 7:2 (.0119) 
4:0 (.0058) § 9:0(.0006) | 8: 1 (.0056) 
4:13 4:0 (.0117) 6:11 9:0(.0016) | 8:1 (.0121) 
4:0 (.0211) 73: 9:0 (.0037) | 8:1 (.0243) 
8:9 = 9 : 0 (.0078) 
5 8:9 0: 9 (.0156) 
0:17 4:1 (.0007) | 3:2 (.0065) 
1:16 4:1 (.0032) | 3:2 (.0239) N.= 10 
22335 5 :0(.0008) | 4:1 (.0093 0:17 5 :5(.0031) | 4:6 (.0120) 
3:14 5 :0(.0021) | 4:1 (.0207) 1:16 6:4 (.0041) | 5:5 (.0152) 
4:13 5 : 0 (.0048) 7:3 (.0037) | 6:4 (.0138) 
5:12 5 : 0 (.0096) 3:14 8:2 (.0025) | 7:3 (.0104) 
6:11 —- 5 : 0 (.0175) 4:13 9:1(.0012) | 7:3 (.0244) 
9:1 (.0032) 8:2 (.0151) 
N:= 6 6:11 10 :0(.0009) | 9:1 (.0075 
O:i7 4:2(.0017) | 3:3 (.0113) 7:10 10 : 0 (.0023) | 9:1 (.0158) 
1:16 5:1(.0010) | 4:2 (.0078) 7:10 a= 0: 10 (.0219) 
5:1 (.0034) 4:2 (.0212) 8:9 10 : 0 (.0052 
3: 6:0(.0008) | 5:1 (.0084) 8:9 —- 0 : 10 (.0110) 
4:13 6:0(.0021) | 5:1 (.0183) 
5222 6 : 0 (.0046) |. N:= 11 
6:11 - 6 : 0 (.0092) 6:27 5 :6(.0047) | 4:7 (.0161) 
7:10 -— 6 : 0 (.0170) 1:16 7:4(.0019) | 5:6(.0221) 
8 : 3 (.0018) 6:5 (.0221) 
N:= 7 3:14 9 : 2 (.0012) 7: 4 (.0189) 
0:17 4:3 (.0033) | 3:4 (.0173) 4:13 9:2 (.0037) | 8:3 (.0141) 
1:16 5 :2(.0028) | 4:3 (.0145) ea 10: 1(.0019) | 9:2 (.0092) 
6:1 (.0013) | 5:2 (.0086) 6:11 10: 1(.0047) | 9:2 (.0201) 
3:14 6:1 (.0037) | 5:2 (.0207) 77% 11 :0(.0015) | 10: 1 (.0106) 
4:13 7:0(.0010) | 6:1 (.0088) 73% _- 0: 11 (.0164) 
7:0 (.0023) | 6:1 (.0184) 8:9 11 :0(.0035) | 10: 1 (.0219) 
6:11 7:0(.0050—) 8:9 0 : 11 (.0078) 
7:10 — 7 : 0 (.0099) 
8:9 7: 0 (.0186) 12 
0:17 6:6(.0019) | 4:8 (.0208) 
8 6 : 6 (.0106) 
0:17 5 :3(.0011) | 3:5 (.0243) 2:15 8:4 (.0036) | 7:5 (.0116) 
1:16 6 : 2 (.0010) 4:4 (.0235) 3:14 9:3 (.0031) 8 : 4 (.0106) 
6:2 (.0036) | 5:3 (.0168) 4:13 10 : 2 (.0021) | “9 : 3 (.0084) 
3:14 7:1 (.0017) 6 : 2 (.0098) 11 »1(.0011) 9:3 (.0197) 
7:1 (.0045) 6: 2 (.0221) 6:11 11 : 1 (.0030) | 10: 2 (.0131) 
33.22 8:0(.0012) | 7:1 (.0100) 7:10 12 :0(.0010) | 11: 1 (.0071 
6:11 8:0 (.0028) | 7:1 (.0202) 7:10 — 0 : 12 (.0125) 
7:10 — 8 : 0 (.0060) 8:9 12 : 0 (.0024) | 11 : 1 (.0156) 
8:9 — 8 : 0 (.0119) 8:9 “=: 0 : 12 (.0057) 
8:9 0: 8 (.0225) 
N.= 13 
N:= 9 0:17 6:7 (.0029 5 : 8 (.0090) 
0:17 5:3 1:16 8:5 (.0015) 
4:16 6:3 (.0023) | 5:4 (.0097 2:15 9:4(.0019) | 7:6 (.0178 


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TABLE V—Continued 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued
Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 17 N2 = 13 N1 = 18 N2 = 2
3:14 10 :3(.0017) | 8:5 (.0179) 0:18 2 
4:13 11: 2(.0013) | 9:4 (.0159) 1:17 om 2:0 (.0158 
5:12 : 2 10 : 3 (.0127) 
7:10 12:1 (.0048) | 11:2(.0195) | 9:18 2:1 (.0143) 
8:9 | 13:0(.0017) | 12:1(.0111) 2:18 01305 
8:9 0 : 13 (.0042)} 1 : 12 (.0238) 
N.= 14 N:= 4 
0:17 6 : 8 (.0041 5 : 9 (.0118) 0:18 3: 1 (.0026) ~ 
1: 16 8 : 6 (.0026) | 6:8 (.0207) 1:17 4:0 (.0007) 3110289 
2:15 9:5 (.0034) | 8:6 (.0100) 2:16 4:0(.0021) | 3:1 (.0239 
3:14 40 : (.9108) 3:15 4:0 (.0048) 
4:13 11: 3(.0031) | 10:4 (. 4:14 = ¢ (.000 
5:12 13 10 : 4 (.0237) 5:13 4:0 (.0172 
6:11 13 : 1 (.0012) | 11 : 3 (.0192) 
6:11 ome 0 : 14 (.0168) N= 5 
7:10 13 : 1 (.0033) | 12 : 2 (.0137) 0:18 4:1 0005) | 3: 2 (.0056) 
7:10 0 : 14 (.0074) 1:17 4:1 :0027) 3:2 "0208 
8:9 14:0 (.0012) | 13:1 (0081) || 4:46 5:0(.0006) | 4:1 
8:9 0 : 14 (.0031)} 1:13 °(.0181)|] 348 3:0( 0017) | 4:1 (10172) 
4:14 5 : 0 (.0037) _ 
N:= 15 5:13 5:0 
0:17 2:5 5:10 (.0149)|| 6:12 5:0 
1:16 8:7 (.0041 7:11 5:0 (.0235 
2:15 10 :5(.0019) | 8:7 (.0149) 
4:1 12:3 (.001 01 
5:12 13 : 2 (.0015) | 11 : 4 (.0160) 
6:11 | 13:2 (0040) | 12:3 (0134) 9217 | | 
7:10 | 14:1 (.0024) | 13 : 2 (.0098) 
8:9 15 : 0 (.0009) | 13 : 2 (.0222) 0 (.0034)
8:9 0 : 15 (.0023)} 1:14 (.0139)) 9:12 
8: 10 6 : 0 (.0223) 
N2 = 16
0:17 7:9 (.0027) | 5:11 (.0184)
1:16 9:7 (.0021) | 7:9 (.0149) N2 = 7
2:15 10 : 6 (.0033) | 8:8 (.0210) 0:18 4:3 (.0028) 3:4
3:14 11 : 5 (.0039) | 10 : 6 (.0106) 1:17 5:2 (.0021) | 4:3 (.0123)
4:13 12 : 4 (.0040) | 11 : 5 (.0113) 2:16 6:1 (.0010) | 5:2 (.0070)
5:12 13 : 3 (.0036) | 12 : 4 (.0109) 3:15 6:1 (.0029) | 5:2 (.0168)
6:11 14 : 2 (.0028) | 13 : 3 (.0094) 4:14 6:1 (.0068)
6:11 0 : 16 (.0112) || 5:13 7:0 (.0016) | 6:1 (.0142)
7:10 15 : 1 (.0017) | 13 : 3 (.0218) 6:12 7:0 (.0036)
7:10 0008) 1:15 (.0242)|] 7:11 7: 0 (.0071) 
8:9 15: 1 (.0044) | 14 : 2 (.0168) 8:10 
8:9 0: 16 (.0018)| 1:15 (.0107)|| 9:9 7:0 (.0238 
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TABLE V—Continued 


SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued

Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 18 N2 = 8 N1 = 18 N2 = 12
0:18 3:5 (.0215 0:18 6 :6(.0016) | 4:8 (.0181 
207 5:3 (.0045 4:4(.0201 7:5 (.0025 6 : 6 (.0086 
2:16 6:2 (.0028) | 5:3 (.0138 2:16 8 :4(.0026) | 7:5 (.0091 
3:15 7:1 (.0012) | 6:2 (.0077 3:15 9:3 (.0022) | 7:5 (.0241 
4:14 7:1 6:2 (.0174 4:14 10 : 2(.0020) | 8:4 (.0197 
5:13 8:0 (.0008) | 7:1 (.0074 5:13 9:3 (.0150 
6:12 8:0(.0019) | 7:1 (.0151) 6:12 11 : 1 (.0020) | 10: 2 (.0091 
8 : 0 (.0041) 11 : 1 (.0046) 
9:9 0 C0186 $:10 12 : 0 (.0015) | 11 
N:= 9 9:9 12 : 0 (.0034) | 11 : 1 (.0209) 
0:18 5 :4(.0016) | 4:5 (.0072) 13 
2= 
0:18 : 7 (.0023) | 4:9 (.0227 
: : 2 (.0012) : 4 (.0234) 
1:17 7:6(.0041) | 6:7 (.0124 
3:15 7:2 (.0037) | 6:3 (.0158) | 2: 16 8:5 (.0047) | 7:6 (.0141 
4:14 8: 1.0016) | 7:2 (.0091) 3:15 9:4(.0044) | 8:5 (.0138 
5:13 8: 1.0040) | 7 : 2 (.0193) 4:14 10 : 3(.0035) | 9:4 (.0118 
6:12 9:0(.0011) | 8:1 (.0087) 5:13 11 : 2.0023) | 10 : 3.0090 
7:11 9:0 (.0024) | 8:1 (.0176) 6:12 12 : 1 (.0012) 
8:10 9 : 0 (.0052) 12 : 1 (.0030) 
|| 8210 | 13:0(0010) | 12 (6071). 
8:10 — 0 : 13 (.0055) 
9:9 13 : 0 (.0024) | 12 : 1 (.0153) 
N.= 10 
0:18 4: 6 (.0103) 14 
6:4 (.0033) | 5:5 (.0126) 0:18 : 8 (.0033) | 5:9 
2:16 6 : 4 (.0110) 1:17 8:6 (.0020) | 6:8 (.0171 
3:15 8:2 pot 7 : 3 (.0080 2:16 9:5(.0025) | 7:7 (.0207 
4:14 8:2 (.0048) | 7:3 (.0189) 3:15 10 : 4 (.0026) | 8:6 (.0215 
5:13 9:1 (.0022) | 8:2 (.0111) 4:14 11:3(.0021) | 9:5 (.0202 
6:12 10 :0(.0006) | 8 : 2 (.0230) 5:13 12: 2(.0015) | 10: 4 (.0173 
7:11 10 :0(.0015) | 9:1 (.0111) 6:12 12 : 2 (.0039) | 11 : 3 (.0134) 
10 : 0 (.0033) 6:12 0 : 14 (.0205) 
: = > 10 (. 7:11 13 : 1 (.0021) | 12 : 2 (.0090) 
9:9 —_ 10 : 0 (.0070) 7:11 ~- 14 (.0095) 
8:10 |13:1(.0050—)} 12 : 2 (.0197) 
No= 11 8:10 0 : 14 (.0042)| 1: 13 (.0235) 
0:18 le 4:7 (.0139) 9:9 14 : 0 (.0017) | 13 : 1 (.0112) 
1:17 7:4(.0014) | 5:6 (.0185) 
2:16 8:3 (.0013) | 6:5 (.0179) 15 
3:15 8:3 (.0041) | 7:4 (.0148) 0:18 6:9 (.0045) | 5: 10 (.0127) 
4:14 9: 2.0026) | 8:3 (.0106) 1:17 6 : 9 (.0226) 
5:13 10:1(.0012) | 8:3 2:16 9:6 (.0043) | 8:7 (.0116 
6:12 10 : 1 (.0031) etree 3:15 et 9 : 6 (.0129 
7:11 11 : 0 (.0009) | 10 : 1 (.0071) 4:14 11:4 (.0044) | 10: 5 (.0127 
7:11 — 0:11 (.0204)|) 5:13 11:4 (.0113 
8:10 11 : 0 (.0022) | 10: 1 (.0148) 6:12 13 : 2 (.0026) | 12 : 3 (.0090 
8:10 0:11 (.0102)|} 6:12 0 : 15 (.0168) 
9:9 11 : 0 (.0049) — 7:11 14 : 1 (.0014) | 12 : 3 (.0203) 
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TABLE V—Continued 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued
Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 18 N2 = 15 N1 = 19 N2 = 4
0: 15 (.0075) 4:0 (.0142) 
8:10 14 : 1 (.0036) | 13 : 2 (.0143) 6:13 4:0 (.0237) 
8:10 0:15 (.0032)} 1: 14 (.0183) 
9:9 15 : 0 (.0013) | 14: 1 (.0084) N.= 5 
0:19 3:2 (.0049) 
N.= 1 1:18 4:1 (.0022) | 3:2 (.0184) 
0:18 7:9(.0021) | 5:11 (.0157) 23.07 5 :0(.0005) | 4:1 (.0065) 
8 : 8 (.0046) 7:9 (.0120) 5 : 0 (.0013) 4:1 (.0145) 
236 10 : 6 (.0023) 8 : 8 (.0167) 4:15 5 : 0 (.0030) _ 
11 : 5 (.0028) 9:7 (.0193) 5:14 5 : 0 (.0059) 
4:14 12 : 4 (.0027) | 10 : 6 (.0204) 6: is 5 : 0 (.0109) 
43 13 : 3 (.0024) | 11 : 5 (.0197) 7212 5: 0 (.0186) 
6:12 14 :2(.0017) | 12 : 4 (.0177) 
6:12 0 : 16 (.0138) N2= 6 
yee 14 : 2 (.0044) | 13 : 3 (.0145) 0:19 4:2(.0012) | 3:3 (.0087) 
7:11 0:16 (.0059)|| 1:18 5: 1.0006) | 4:2 (.0054) 
8:10 15 : 1 (.0025) | 14: 2 (.0105) 5:1 (.0021) 4:2 (.0151) 
8:10 0: 16 (.0024)} 1: 15 (.0143) 3:16 6 : 0 (.0005) 5:1 (.0056) 
9:9 16 : 0 (.0009) | 14 : 2 (.0229) 4:15 6:0(.0012) | 5:1 (.0119) 
5:14 6 : 0 (.0026) 5:1 (.0225) 
17 6:13 6 : 0 (.0052) 
0:18 7:10 (.0029)| 5:12 (.0191) 6 : 0 (.0097) 
9:8(.0025) | 7:10 (.0159) 8:11 6 : 0 (.0170) 
2:36 10 : 7 (.0039) 8 : 9 (.0228) 
3:35 11 : 6 (.0048) | 10: 7 (.0122) N.= 7 
4:14 13 : 4 (.0017) | 11 : 6 (.0134) 0:19 4:3(.0023) | 3:4 (.0135) 
5343 13 :4( 0050—)| 12 : 5 (.0134) 4:28 5 :2(.0017) | 4:3 (.0104) 
6:12 13 : 4 (.0124) 2 257 6:1 (.0008) | 5:2 (.0057) 
6:12 -— 0:17 (.0114) 3346 6:1 (.0023) 5 : 2 (.0138) 
15 : 2 (.0031) | 14: 3 (.0105) 4:15 7:0 (.0005) 6:1 (.0053) 
0:17 (.0047)) 1: 16 (.0249) 7:0 (.0012) 6:1 (.0110) 
8:10 16: 1(.0019) | 14 : 3 (.0233) 6:13 7:0 (.0026) | 6:1 (.0209) 
8:10 0:17 (.0019)} 1: 16 (.0112) 7333 7:0 (.0052) 
9:9 16 : 1 (.0047) | 15 : 2 (.0178) 8:11 wi 7: 0 (.0098) 
9:10 7:0 (.0174) 
Ni= 19 N.= 2 
0:19 2:0 (.0048) N.= 8 
38 2:0 (.0143) 0:19 4:4(.0040) | 3:5 (.0191) 
1:18 5:3 (.0037) | 4:4 (.0172) 
N.= 3 2:17 6:2 (.0023) | 5:3 (.0114) 
0:19 3:0 (.0006) | 2:1 (.0130) 3: 16 7:1 (.0009) | 6:2 (.0061) 
1:48 3:0 (.0026) -- 4:15 7:1 (.0024) | 6:2 (.0138) 
2247 3 : 0 (.0065) 5:14 8 :0(.0006) | 7:1 (.0056) 
3:16 3 : 0 (.0130) 8 : 0 (.0014) 7:1 (.0114) 
4:15 3:0 (.0227) 8:0 (.0029) | 7:1 (.0215) 
8 : 0 (.0058) 
N.= 4 9:10 8 : 0 (.0110) 
0:19 3: 1.0023) | 2:2 (.0237) 9:10 0:8 (.0197) 
2358 4:0(.0005) | 3:1 (.0087) 
a2a7 4:0(.0017) | 3:1 (.0208) N.= 9 
3:16 4:0 (.0040) --- 0:19 5 :4(.0013) | 4:5 (.0062) 
4:15 4:0 (.0079) 1:18 6:3 (.0013) | 5:4 (.0066) 
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TABLE V—Continued 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued
Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 19 N2 = 9 N1 = 19 N2 = 12
2:17 6:3 (.0048) | 5:4 (.0195) : 10 12 : 0 (.0021) | 11 : 1 (.0140) 
3:16 7:2 (.0028) | 6:3 (.0127) 9:10 0 : 12 (.0046) — 
5:14 8:1 (.0029 : 2 (.0149 2= 
6:13 9:0(.0007) | 8:1 (.0064) 0:19 6:7(.0019) | 4:9 (.0199) 
73%2 9:0(.0017) | 8:1 (.0128) 1:18 7: 6 (.0032) | 6:7 (.0102) 
8:11 9:0 (.0035) | 8:1 (.0241) 2:17 8 : 5 (.0036) | 7: 6 (.0114) 
8:11 Piri 0 : 9 (.0243) 3:16 9:4 (.0033) | 8:5 (.0106) 
: a : 4:15 10 : 3(.0026) | 9:4 (.0088) 
9:10 9 : 0 (.0070) 
0:9 (0134) | 14:2 C0040) | 10:3 C0145) 
10 7:12 12 : 1 (.0020) | 11 0090) 
0:19 5:5 (.0021) | 4: 6 (.0088) 7:12 0 : 13 (.0150) 
1:18 6:4 (.0027) | 5:5 (.0105) 8:11 12 : 1 (.0046) | 11 : 2 (.0190) 
8:11 — 0 : 13 (.0072) 
2:17 7:3 (.0021) | 6:4 (.0089) 
3:16 8:2 (.0013) | 6:4 (.0224) 9:10 13 : 0 (.0014) | 12: 1 (.0099) 
4:15 8 : 2 (.0036) | 7:3 (.0148) 9:10 0 : 13 (.0033)|} 1 : 12 (.0200) 
5:14 9:1 8:2 N= 14 
6:13 9:1 (.0037) | 8:2 (.0173 
7:12 | 10:0(.0010) | 9:1(.0078) || 9:12 | ©:8¢.0027) | 4:10 ( 0245) 
: 1:18 7:7(.0049) | 6:8 (.0142) 
2:17 9:5 (0019) | 7:7 (.0168) 
AC. : 3:16 10 : 4(.0018) | 8: 6 (.0169) 
9:10 10 : 0 (.0046) 4:15 | 11:3(.0015)| 9:5 (.0153) 
9:10 0:10 (.0092)) 5:14 =| 11: 3 (0039) | 10 : 4 (0127) 
; 6:13 12 : 2 (.0026) | 11 : 3 (.0094) 
11 ; 
3:16 8:3 (.0031) | 7:4 (.0115) 
4:15 9:2(.0019) | 8:3 (0080) || 8:1! 7 : 14 (-0054) 
5:14 | 9:2°(.0047) | 8:3(.0179) || 2:10 | 14:0(.0010) | 13 : 1 (0071) 
6:13 10 : 1 (0022) | 9 : 2 (0108) 9:10 0 : 14 (.0024)| 1: 13 (.0150) 
7:12 10 : 1 (.0048) | 9: 2 (.0213) No= 15 
7:12 0: 11 6 : 9 (.0037) | 5: 10 (.0108) 
: 0 (.0014) : 1:18 8:7 (.0024) | 6:9 (.0189) 
: oe 2:17 9 : 6 (.0032) | 7:8 (.0234 
11 : 0 (.0031) scenes) 3:16 
: . 4:15 11 :4(.0031) | 9:6 (.0243 
5:14 12 : 3 (.0024) | 10 : 5 (.0219) 
12 6:13 13 : 2 (.0016) | 11 : 4 (.0183 
: 75 (. 7 (. 7:12 13 : 2.0041) | 12 : 3 (.0140 
2:2 47 8:4 (.0020) | 6:6 (.0220) 7:12 ) 0 
3: 16 9:3 (.0016) | 7:5 (.0193) 8:11 14 : 1 (.0022) | 13 : 2 (.0094) 
4:15 9:3 (.0044) | 8:4 8:11 0:15 (.0042)} 1: 14 (.0234) 
5:14 10 : 2 (.0028) | 9:3 (.0106) 9:10 15 : 0 (.0007) | 13 : 2 (.0199) 
6: 13 11: 1(.0012) | 9:3 (.0227) 9:10 0:15 (.0018)| 1: 14 (.0113) 
7:12 11 : 1 (.0030) | 10: 
— 0:12 (.01 2= 
8:11 12 : 0 (.0009) | 11 : 1 (.0068) 0:19 6 : 10 (.0049)| 5:11 (.0135) 
8:11 — 0 : 12 (.0096)|} 1:18 8 : 8 (.0037) | 6: 10 (.0243) 


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TABLE V—Continued 


SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued

Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 19 N2 = 16 N1 = 20 N2 = 3
23237 10:6 8:8 2:18 3:0 (.0056 
3:16 11 : 5 (.0019 9:7 (.0150 3:0 (.0113 
12 :4(.0018) | 10:6 4:16 3:0 ( 0198 
5:14 13 : 3 (.0015) | 11:5 (.0143 
6:13 13 : 3 (.0040) | 12 : 4 (.0124) N.= 4 
6:13 —_ 0 : 16 (.0167) 0:20 3:1 (.0020 2:2 (.0217 
7 342 14 : 2 (.0028) | 13 : 3 (.0098) 1399 4:0 (.0005 331 
0 : 16 (.0075) 2:18 4:0 (.0014 3:1 (.0184 
8:11 15 : 1.0015) | 13 : 3 (.0210) 4:0 (.0033) 
8:11 0 : 16 (.0032)} 1:15 (.0184) 4:16 — 4:0 (.0066) 
9:10 15 : 1 (.0037) | 14: 2 (.0148) eR oa 4:0 01983 
9:10 0: 16 (.0013)|} 1:15 (.0086) 6:14 4:0 (.0198 
N.= N.= 5 
0:19 7:10 (.0023)| 5:12 (.0164) 0:20 3: 2 (.0043) — 
1:18 9:8 ere 7: 10 (.0130) 1:19 4:1(.0019) | 3:2 {eee 
10 : 7 (.0029) | 8:9 2:18 5:0 (.0004) | 4:1 (.0054 
3:16 11 9 : 8 (.0217) 5:0(.0011) | 4:1 (.0122 
4:15 12 : 5 (.0036) | 10: 7 (.0234) 4:16 5:0 + oan 4:1 (.0235) 
5:14 13:4 00273 11 : 6 (.0233) S385 5 : 0 (.0047) 
14 : 3 (.0027) | 12: 5 (.0219) 6:14 -- 5:0 
6:13 0: 17 (.0139) 5:0 (.0149 
15 : 2 (.0019) | 13 : 4 (.0191) 8:12 5 : 0 (.0242) 
7312 0 : 17 (.0060) 
8:11 15 : 2 (.0048) | 14:3 (.0154) N.= 6 
8:11 0:17 (.0025)| 1:16 (.0146) 0:20 4:2(.0010) | 3:3 (.0077) 
9:10 16: 1(.0027) | 15: 2 (.0110 1299 4:2 (.0047) 
9:10 0.17 (.0010) | 2:15 (.0236) 2:18 5:1 (.0017) | 4:2 (.0129) 
S227 5 : 1 (.0045) 
N:.= 18 4:16 6:0(.0009) | 5:1 
0:19 7:11 (.0031)| 5:13 (.0197) 5335 6:0(.0020) | 5:1 (.0184 
1:18 9:9(.0028) | 7: i1 (.0168) 6:14 6 : 0 (.0040) _— 
10 : 8 (.0043) | 8: 10 (.0243) 72 6:0 
3:3 36 12 :6(.0021) | 10:8 8:12 6 : 0 (.0130) 
4:15 13 : 5 (.0023) | 11 : 7 (.0153) 9:11 — 6 : 0 (.0217) 
5:14 14 : 4,(.0022) | 12 : 6 (.0159) 
6:13 15 : 3 (.0018) | 13: 5 (.0154 7 
6: 13 0 : 18 (.0117) 0:20 4:3(.0020) | 3:4 (.0120 
7342 15 : 3 (.0048) | 14 : 4 (.0138) 1:19 $22 
0 : 18 (.0049) 2:33 (.0047) | 4:3 
16 : 2 (.0035) | 15 : 3 (.0115) 3317 6:1(.0017) |} 5:2 (.0114 
a: 31 0:18 (.0020)} 1:17 (.0116) 4:16 6:1 (.0041) | 5:2 (.0234) 
9:10 17 : 1 (.0020) | 15 : 3 (.0247) 5345 7:0(.0009) | 6:1 (.0087) 
9:10 0: 18 (.0007)} 2: 16 (.0185) 6:14 7:0(.0019) | 6:1 (.0165) 
7: 0 (.0039) 
8:12 7: 0 (.0072) 
2: 0 (.0130) 10: 10 7:0 (.0219 
N.= 8 
2:1 (.0119) 0:20 4:4 (.0034) | 3:5 
(.0031) | 4:4 (.0148 
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TABLE V—Continued 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
UNEQUAL SAMPLES UP TO N1 = 20, N2 = 19—Continued
Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 20 N2 = 8 N1 = 20 N2 = 11
2:8 6:2 5:3 9:11 11 : 0 (.0020) | 10 : 1 (.0140) 
3:17 6:2 (.0048) | 5:3 (.0223 9:11 0 : 11 (.0083) 
0019) 10 : 10 11 : 0 (.0042) | 
6:14 7:1 (.0087 N:= 12 
7:13 8:0(.0021) | 7:1 (.0164 0: 20 5:7 (.0039) | 4:8 (.0138 
8:12 8: 0 (.0041 oan 1:19 7:5(.0015) | 5:7 (.0185 
9:11 8 : 0 (.0078 2:18 8:4 (.0016) | 6:6 (.0182 
9:11 0 : 8 (.0243 3:17 8:4 (.0047) | 7:5 (.0156 
10 : 10 8:0 (.0141 4:16 9:3 (.0033) | 8:4 (.0118 
5:15 10 : 2(.0019) | 9:3 (.0079 
N:= 9 6:14 10 : 2 (.0046) | 9:3 (.0170 
0:20 5:4(.0011) | 3:6 (.0230 7313 11 : 1 (.0021) | 10 : 2 (.0098 
1:19 6:3 (.0011) | 4:5 (.0223) 7:13 — 0 : 12 (.0230)
2:18 6:3 (.0039) | 5:4 (.0164 8:12 11 : 1 (.0046) | 10 : 2 (.0197 
3:17 7:2(.0021) | 6:3 (.0103) 8:12 — 0 : 12 (.0120) 
4:16 8: 1.0008) | 6:3 (.0224) 9:11 12 : 0 (.0013) | 11 : 1 (.0095) 
5:15 8:1(.0022) | 7:2 (.0116) 9:11 12 (. 
133 10 : 10 12.: 0 (.0029) | 11 : 1 (.0185) 
213 : ‘ 
8:12 9:0 8 : 1 (.0178) 13 
9:11 9:0 (.0049 0:20 6:7 4:9 
9:11 1:19 7:6 (.0026) | 5:8 (.0248 
10 : 10 9:0 (.0092 2:18 8:5 (.0028) | 7:6 (.0092 
9:4(.0025) | 7:6 (.0239 
N.= 10 4:16 10 : 3(.0018) | 8:5 (.0201 
0: 20 5:5 (.0018) | 4:6 (.0077) 5:15 10 : 3.0047) | 9:4 (.0153 
1:19 5 6:14 11 : 2 (.0028) | 10 : 3 (.0106 
2:18 7:3(.0017) | 6:4 (.0072 7:13 12 : 1 (.0012) | 10: 3 (.0218 
3:17 +23 C0049} 6:4 (.0184 7:13 0 : 13 (.0181) 
4:16 8 : 2 (.0026) | 7:3 (.0115) 8:12 12 : 1 (.0030) | 11 : 2 (.0132) 
5:15 9:1(.0011) | 7:3 (.0241 8:12 — 0 : 13 (.0091) 
6:14 9:1 ‘029 8:2 0131) 12:1(. 
7:13 10 :0(.0006) | 9:1(. : 13 (. 
10 : 0 (.0015) 10: 10 13 : 0 (.0020) | 12 : 1 (.0133) 
9:11 10 : 0 (.0031) | 9:1 (.0209) 14 
9:11 0:10(.0117)/) 0:20 6:8 (.0022) | 4: 10 (.0216) 
10 : 10 10 : 0 (.0061) 1:19 7:7 (.0040) | 6 
2:18 8:6 (.0047) | 7:7 (.0135) 
11 3:17 9:5(.0045) | 8 
0°: 20 5 :6(.0027) | 4:7 (.0105) 4:16 10 : 4 (.0038) | 9:5 (.0118) 
1:19 6:5 (.0036) | 5: 6 (.0132) 5:15 11 : 3 (.0028) | 10 : 4 (.0094) 
2:38 7:4 (.0033) | 6:5 (.0119) 6:14 10 
3:17 8 : 3 (.0023) | 7:4 (.0092 7:13 12 : 2 (.0041) | 11 : 3 (.0145) 
4:16 9:2(.0014) | 7:4 (.0212) 7:13 _- 0 : 14 (.0144) 
5:15 9 : 2 (.0034) | 8:3 (.0138) 8:12 13 : 1 (.0020) | 12:2¢. 
6:14 10:1 “0015 9 : 2 (.0077 8:12 — 0 : 14 (.0069) 
7:43 10 : 1 (.0033) | 9: 2 (.0157) 9:11 13 : 1 (.0045) | 12 : 2 (.0185) 
8:12 11 : 0 (.0009) | 10 : 1 (.0071) 9:11 0 : 14 (.0032)| 1: 13 (.0193) 
8:12 0: 11 (.0160)|/ 10:10 14 : 0 (.0014) | 13 




TABLE V—Concluded 
SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES— 
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UNEQUAL SAMPLES UP TO N, = 20, Ne = 19—Concluded 


Larger     Smaller sample (N2)—             Larger     Smaller sample (N2)—
sample     minimum differences              sample     minimum differences
(N1)       Highly        Significant        (N1)       Highly        Significant
           significant   (P < .025)                    significant   (P < .025)
           (P < .005)                                  (P < .005)
N1 = 20 N2 = 15 N1 = 20 N2 = 17
0: 20 5: 10 (.0093)|| 7:13 14 : 3 (.0045) | 13 : 4 (.0134) 
1:19 8:7(.0019) | 6:9 (.0159) 7:13 — 0: 17 (.0075) 
2:18 9:6 (.0024) | 7:8 (.0192) 8:12 15 : 2 (.0030) | 14 : 3 (.0103)
3:17 10 : 5 (.0025) | 8:7 (.0200) 8:12 0: 17 (.0033)| 1: 16 (.0186) 
4: 16 11 :4(.0022) | 9:6 (.0189) 9:11 16 : 1 (.0016) | 14 : 3 (.0217) 
5:15 12 : 3(.0017) | 10: 5 (.0165) 9:11 0:17 (.0014) | 1: 16 (.0087) 
6:14 12 : 3 (.0043) | 11 : 4 (.0134) || 10:10 16 : 1 (.0039) | 15 : 2 (.0152) 
6:14 — 0 : 15 (.0239) 
7:13 13 : 2 (.0027) | 12 : 3 (.0098) N2 = 18
7:13 — 0 : 15 (.0115) || 0:20 7 : 11 (.0025) | 5 : 13 (.0171)
14 : 1 (.0014) is Conse) 1:19 9:9 (.0022) | 7:11 (.0139) 
: rag . 2:18 10 : 8 (.0033) | 8:10 (.0198 
9:11 4 : 1 (0032) | 13:2 (0134) || 3:17 | 11 (0082) 9:9 (6238). 
9:11 0:15 (.0024)| 1: 14 (.0147) : 
10: 10 15:0 (.0010) | 14 : 1 (.0070) 4:16 12 : 6 (.0044) | 11 : 7 (.0115) 
5:15 13 : 5 (.0044) | 12 : 6 (.0115) 
N2 = 16 6:14 14 : 4 (.0039) | 13 : 5 (.0109)
0:20 6: 10 (.0041)} 5:11 (.0116)|| 6:14 0 : 18 (.0140) 
1:19 8:8 (.0028) | 6:10 (.0206)|| 7:13 15 : 3 (.0031) | 13 : 5 (.0237) 
2:18 9:7 (.0039) | 8:8 (.0105) 7:13 — 0 : 18 (.0061) 
3:17 10 : 6 (.0044) | 9:7 (.0117) 8:12 16 : 2 (.0021) | 14 : 4 (.0204)
4:16 11 : 5 (.0042) | 10 : 6 (.0116) 8:12 0 : 18 (.0026) | 1 : 17 (.0149)
5:15 12 : 4 (.0036) | 11 : 5 (.0106) 9:11 17 : 1 (.0011) | 15 : 3 (.0163)
6:14 13 : 3 (.0027) | 11 : 5 (.0233) || 9:11 0 : 18 (.0010) | 2 : 16 (.0243)
6:14 — 0 : 16 (.0199) || 10:10 17 : 1 (.0029) | 16 : 2 (.0115)
7:13 14 : 2 (.0018) | 12 : 4 (.0191)
7:13 — 0 : 16 (.0093) N2 = 19
8:12 14 : 2 (.0043) | 13 : 3 (.0145) 0:20 7 : 12 (.0033) | 5 : 14 (.0202)
8:12 0 : 16 (.0042)) 1:15 (.0232)|/| 1:19 9 : 10 (.0030)| 7:12 (.0176) 
9:11 15 : 1 (.0022) | 14 : 2 (.0096) 2:18 10 : 9 (.0049) | 9: 10 (.0116) 
9:11 0: 16 (.0018)| 1:15 (.0113)|| 3:17 12 : 7 (.0025) | 10 : 9 (.0149) 
10 : 10 16 : 0 (.0007) | 14 : 2 (.0199) 4: 16 13 : 6 (.0029) | 11 : 8 (.0171) 
5:15 14 : 5 (.0029) | 12 : 7 (.0182) 
N:= 17 6:14 15 : 4 (.0027) | 13 : 6 (.0182) 
0:20 7:10 (.0019)} 5:12 (.0142)/| 6:14 — 0: 19 (.0119) 
1:19 8:9 eo 7:10 (.0107)|| 7:13 16 : 3 (.0022) | 14 : 5 (.0171) 
2:18 10 : 7 (.0021) | 8:9 (.0147) 7:13 — 0: 19 (.0050+) 
3:17 1: 6(.0025) | 9:8 (.0171) 8:12 17 : 2 (.0015) | 15 : 4 (.0152) 
4: 16 12 : 5 (.0025) | 10 : 7 (.0179) 8:12 0: 19 (.0020)} 1 : 18 (.0121) 
5:15 13 : 4 (,0023) | 11 : 6 (.0174) 9:11 17 : 2 (.0038) | 16 : 3 (.0123) 
6:14 14 : 3 (.0018) | 12 : 5 (.0159) 9:11 0: 19 (.0008)} 2:17 (.0193) 
6:14 — 0:17 (.0167)|| 10:10 18 : 1 (.0022) | 17 : 2 (.0089) 
TABLE VI

SIGNIFICANT DIFFERENCES IN FOURFOLD CONTINGENCY TABLES:
EQUAL SAMPLES; N = 20 AND OVER

          More A’s in Sample (2) than in Sample (1)  |  Fewer A’s in Sample (2) than in Sample (1)
          Percentage of A’s in Sample (1)            |  Percentage of A’s in Sample (1)
    N        0        10       25       50           |     0        10       25       50

Minimum highly significant percentage differences (P < .005)
   20     35.00    45.00    50.00    45.00           |  35.00       -        -      45.00
   30     26.67    33.33    36.67    36.67           |  26.67       -      26.67    36.67
   40     20.00    27.50    32.50    30.00           |  20.00       -      22.50    30.00
   50     16.00    24.00    28.00    28.00           |  16.00       -      21.00    28.00
   60     13.33    21.67    25.00    25.00           |  13.33       -      20.00    25.00
   70     11.43    18.57    22.86    22.86           |  11.43       -      17.86    22.86
   80     10.00    17.50    21.25    21.25           |  10.00     10.00    17.50    21.25
   90      8.89    16.67    20.00    20.00           |   8.89     10.00    16.11    20.00
  100      8.00    15.00    19.00    19.00           |   8.00      9.00    15.00    19.00
  200      4.00    10.00    12.50    13.50           |   4.00      7.00    11.00    13.50
  500      1.60     5.80     7.60     8.40           |   1.60      4.60     7.00     8.40
 1000      0.80     3.90     5.30     5.90           |   0.80      3.30     5.00     5.90

Minimum significant percentage differences (P < .025)
   20     25.00    35.00    40.00    35.00           |  25.00       -      25.00    35.00
   30     20.00    26.67    30.00    30.00           |  20.00       -      23.33    30.00
   40     15.00    20.00    25.00    25.00           |  15.00       -      20.00    25.00
   50     12.00    18.00    22.00    22.00           |  12.00       -      18.00    22.00
   60     10.00    16.67    20.00    20.00           |  10.00     10.00    16.67    20.00
   70      8.57    14.29    17.14    18.57           |   8.57     10.00    14.29    18.57
   80      7.50    13.75    16.25    17.50           |   7.50      8.75    13.75    17.50
   90      6.67    12.22    15.56    15.56           |   6.67      8.89    13.33    15.56
  100      6.00    11.00    14.00    15.00           |   6.00      8.00    12.00    15.00
  200      3.00     7.50     9.50    10.50           |   3.00      6.00     8.50    10.50
  500      1.20     4.40     5.80     6.40           |   1.20      3.60     5.40     6.40
 1000      0.60     2.90     4.00     4.50           |   0.60      2.60     3.80     4.50

Note. For mode of use see Example 19. N = total number of individuals in each sample.
P = Probability. A dash indicates that no entry is given. Irregularities in the sequence
of percentages are due to the discontinuity of the distributions.
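
The small-sample entries of Table VI can be checked by direct computation. The sketch below assumes that each entry is the smallest difference for which the exact single-tail probability, with all marginal totals of the fourfold table fixed, falls below the stated level; the function names are hypothetical, and the entries for large N may have been obtained by other means (e.g., chi square), so not every entry need be reproduced.

```python
from math import comb


def one_tailed_p(a1: int, a2: int, n: int) -> float:
    """Exact single-tail probability for two equal samples of size n.

    With the total number of A's (a1 + a2) held fixed, this is the chance
    that Sample (1) contains a1 or fewer A's (a1 <= a2 is assumed).
    """
    total_a = a1 + a2
    ways = sum(comb(total_a, x) * comb(2 * n - total_a, n - x)
               for x in range(a1 + 1))
    return ways / comb(2 * n, n)


def min_significant_a2(a1: int, n: int, alpha: float):
    """Smallest count a2 > a1 in Sample (2) with single-tail P < alpha."""
    for a2 in range(a1 + 1, n + 1):
        if one_tailed_p(a1, a2, n) < alpha:
            return a2
    return None  # no significant difference is attainable


if __name__ == "__main__":
    n = 20
    for a1 in (0, 2):              # 0% and 10% of A's in Sample (1)
        for alpha in (0.005, 0.025):
            a2 = min_significant_a2(a1, n, alpha)
            diff = 100.0 * (a2 - a1) / n
            print(f"N={n}, A's in (1)={a1}, P<{alpha}: difference {diff:.2f}%")
```

For N = 20 this gives differences of 35 and 25% when Sample (1) contains no A's, and 45 and 35% when it contains 10% of A's, in agreement with the first row of each half of the table.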
TABLE VII
CHI SQUARE PROBABILITIES

Degrees of freedom    P = 0.10    P = 0.05    P = 0.01

         1              2.706       3.841       6.635
         2              4.605       5.991       9.210
         3              6.251       7.815      11.345
         4              7.779       9.488      13.277
         5              9.236      11.070      15.086
         6             10.645      12.592      16.812
         7             12.017      14.067      18.475
         8             13.362      15.507      20.090
         9             14.684      16.919      21.666
        10             15.987      18.307      23.209
        11             17.275      19.675      24.725
        12             18.549      21.026      26.217
        13             19.812      22.362      27.688
        14             21.064      23.685      29.141
        15             22.307      24.996      30.578
        16             23.542      26.296      32.000
        17             24.769      27.587      33.409
        18             25.989      28.869      34.805
        19             27.204      30.144      36.191
        20             28.412      31.410      37.566
        21             29.615      32.671      38.932
        22             30.813      33.924      40.289
        23             32.007      35.172      41.638
        24             33.196      36.415      42.980
        25             34.382      37.652      44.314
        26             35.563      38.885      45.642
        27             36.741      40.113      46.963
        28             37.916      41.337      48.278
        29             39.087      42.557      49.588
        30             40.256      43.773      50.892

Additional details for one degree of freedom

P             0.99        0.95       0.90      0.70      0.30
½P            0.495       0.475      0.45      0.35      0.15
Chi square    0.000157    0.00393    0.0158    0.148     1.074

P             0.20        0.10       0.05      0.02      0.001
½P            0.10        0.05       0.025     0.01      0.0005
Chi square    1.642       2.706      3.841     5.412     10.827

Note. Extracted from Table IV of Fisher and Yates’s Statistical Tables, with permission of
the publishers, Messrs. Oliver and Boyd. For mode of use see Examples 13, 20, and 27.
P = probability of obtaining, by random sampling, chi square values as great as, and greater than,
the specified value.
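
For one degree of freedom the probabilities in Table VII can be reproduced without tables, because chi square is then the square of a normal deviate. The following is a minimal sketch under that relationship; the function names and the illustrative fourfold table (30 A's in 100 against 15 A's in 100) are hypothetical, and the chi square is computed without any correction for continuity, which is an assumption rather than a rule taken from the text.

```python
from math import erfc, sqrt


def chi_square_fourfold(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected chi square for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_one_degree_of_freedom(chi2: float) -> float:
    """P of a chi square value as great as, or greater than, chi2 (1 d.f.)."""
    # chi2 with 1 d.f. is the square of a standard normal deviate z,
    # so P = P(|z| >= sqrt(chi2)) = erfc(sqrt(chi2 / 2)).
    return erfc(sqrt(chi2 / 2.0))


if __name__ == "__main__":
    # Check against the one-degree-of-freedom entries of the table.
    for value in (2.706, 3.841, 6.635, 10.827):
        print(f"chi square {value:>6}: P = {p_one_degree_of_freedom(value):.4f}")

    # Hypothetical data: 30 A's in 100 versus 15 A's in 100.
    chi2 = chi_square_fourfold(30, 70, 15, 85)
    print(f"chi square = {chi2:.3f}, P = {p_one_degree_of_freedom(chi2):.4f}")
```

Halving the result gives the single-tail value ½P listed in the additional details for one degree of freedom.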
TABLE VIII
FOUR-PLACE LOGARITHMS OF FACTORIALS OF NUMBERS UP TO 1000
No. Log of factorial No. Log of factorial No. Log of factorial 

1 0.0000 51 66.1906 101 159.9743 
2 0.3010 52 67 .9066 102 161.9829 
3 0.7782 53 69.6309 103 163.9958 
4 1.3802 54 71.3633 104 166.0128 
5 2.0792 55 73.1037 105 168.0340 
6 2.8573 56 74.8519 106 170.0593 
7 3.7024 57 76.6077 107 172.0887 
8 4.6055 58 78.3712 108 174.1221 
9 5.5598 59 80.1420 109 176.1595 
10 6.5598 60 81.9202 110 178.2009 
11 7.6012 61 83.7055 111 180.2462 
12 8.6803 62 85.4979 112 182.2955 
13 9.7943 63 87.2972 113 184.3485
14 10.9404 64 89.1034 114 186.4054 
15 12.1165 65 90.9163 115 188.4661 
16 13.3206 66 92.7359 116 190.5306 
17 14.5511 67 94.5619 117 192.5988 
18 15.8063 68 96.3945 118 194.6707 
19 17.0851 69 98 .2333 119 196.7462 
20 18.3861 70 100.0784 120 198.8254 
21 19.7083 71 101.9297 121 200 . 9082 
22 21.0508 72 103.7870 122 202.9945 
23 22.4125 73 105 .6503 123 205 .0844 
24 23.7927 74 107.5196 124 207.1779 
25 25.1906 75 109.3946 125 209.2748 
26 26.6056 76 111.2754 126 211.3751 
27 28.0370 77 113.1619 127 213.4790 
28 29.4841 78 115.0540 128 215.5862 
29 30.9465 79 116.9516 129 217.6967 
30 32.4237 80 118.8547 130 219.8107 
31 33.9150 81 120.7632 131 221.9280 
32 35.4202 82 122.6770 132 224.0485 
33 36.9387 83 124.5961 133 226.1724 
34 38.4702 84 126.5204 134 228.2995 
35 40.0142 85 128.4498 135 230.4298 
36 41.5705 86 130.3843 136 232.5634 
37 43.1387 87 132.3238 137 234.7001 
38 44.7185 88 134.2683 138 236.8400 
39 46.3096 89 136.2177 139 238.9830 
40 47.9116 90 138.1719 140 241.1291 
41 49.5244 91 140.1310 141 243.2783
42 51.1477 92 142.0948 142 245 .4306 
43 52.7811 93 144.0632 143 247 .5860 
44 54.4246 94 146.0364 144 249.7443 
45 56.0778 95 148.0141 145 251.9057 
46 57.7406 96 149.9964 146 254.0700 
47 59.4127 97 151.9831 147 256.2374 
48 61.0939 98 153.9744 148 258.4076 
49 62.7841 99 155.9700 149 260 . 5808 
50 64.4831 100 157.9700 150 262.7569 
151 264.9359 201 377.2001 251 494.9093 
152 267.1177 202 379.5054 252 497.3107 
153 269 .3024 203 381.8129 253 499.7138 
154 271.4899 204 384.1226 254 502.1186 
155 273 .6803 205 386.4343 255 504.5252 
156 275.8734 206 388.7482 256 506.9334 
157 278 .0693 207 391.0642 257 509.3433 
158 280.2679 208 393.3822 258 511.7549 
159 282 .4693 209 395.7024 259 514.1682 
160 284.6735 210 398 .0246 260 516.5832 
161 286.8803 211 400.3489 261 518.9999 
162 289 .0898 212 402.6752 262 521.4182 
163 291.3020 213 405 .0036 263 523.8381 
164 293.5168 214 407 .3340 264 526.2597 
165 295 .7343 215 409 . 6664 265 528.6830 
166 297.9544 216 412.0009 266 531.1079 
167 300.1771 217 414.3373 267 533.5344 
168 302.4024 218 416.6758 268 535.9625 
169 304 .6303 219 419.0162 269 538.3922 
170 306.8608 220 421.3587 270 540.8236
171 309 .0938 221 423.7031 271 543.2566 
172 311.3293 222 426.0494 272 545.6912 
173 313.5674 223 428.3977 273 548.1273 
174 315.8079 224 430.7480 274 550.5651 
175 318.0509 225 433.1002 275 553.0044
176 320.2965 226 435.4543 276 555.4453 
177 322.5444 227 437.8103 277 557 .8878 
178 324.7948 228 440.1682 278 560.3318 
179 327.0477 229 442.5281 279 562.7774 
180 329.3030 230 444.8898 280 565.2246 
181 331.5607 231 447.2534 281 567 .6733 
182 333.8207 232 449.6189 282 570.1235 
183 336.0832 233 451.9862 283 572.5753 
184 338.3480 234 454.3555 284 575.0287 
185 340.6152 235 456.7265 285 577.4835 
186 342.8847 236 459.0994 286 579.9399 
187 345.1565 237 461.4742 287 582.3977 
188 347 .4307 238 463 .8508 288 584.8571 
189 349.7071 239 466.2292 289 587.3180 
190 351.9859 240 468 . 6094 290 589.7804 
191 354.2669 241 470.9914 291 592.2443 
192 356.5502 242 473.3752 292 594.7097 
193 358.8358 243 475.7608 293 597.1766 
194 361.1236 244 478.1482 294 599.6449 
195 363.4136 245 480.5374 295 602.1147 
196 365.7059 246 482.9283 296 604 .5860 
197 368 .0003 247 485.3210 297 607 .0588 
198 370.2970 248 487.7154 298 609 . 5330 
199 372.5959 249 490.1116 299 612.0087 
200 374.8969 250 492.5096 300 614.4858 
301 616.9644 351 742 .6373 401 871.4096 
302 619.4444 352 745.1838 402 874.0138 
303 621.9258 353 747.7316 403 876.6191 
304 624.4087 354 750.2806 404 879.2255 
305 626.8930 355 752.8308 405 881.8329 
306 629.3787 356 755.3823 406 884.4415 
307 631.8659 357 757.9349 407 887.0510 
308 634.3544 358 760.4888 408 889.6617 
309 636.8444 359 763 .0439 409 892.2734 
310 639 .3357 360 765 .6002 410 894. 8862 
311 641.8285 361 768.1577 411 897.5001 
312 644.3226 362 770.7164 412 900.1150 
313 646.8182 363 773.2764 413 902.7309 
314 649.3151 364 775.8375 414 905 .3479 
315 651.8134 365 778.3997 415 907 .9660 
316 654.3131 366 780.9632 416 910.5850 
317 656.8142 367 783.5279 417 913.2052 
318 659.3166 368 786.0937 418 915.8264 
319 661.8204 369 788 .6608 419 918.4486 
320 664.3255 370 791.2290 420 921.0718 
321 666.8320 371 793.7983 421 923.6961 
322 669 . 3399 372 796.3689 422 926.3214 
323 671.8491 373 798 .9406 423 928.9478 
324 674.3596 374 801.5135 424 931.5751 
325 676.8715 375 804.0875 425 934.2035 
326 679.3847 376 806.6627 426 936.8329 
327 681.8993 377 809 . 2390 427 939.4633 
328 684.4152 378 811.8165 428 942.0948 
329 686.9324 379 814.3952 429 944.7272 
330 689.4509 380 816.9749 430 947 .3607 
331 691.9707 381 819.5559 431 949 .9952 
332 694.4918 382 822.1379 432 952.6307 
333 697 .0143 383 824.7211 433 955.2672 
334 699 . 5380 384 827.3055 434 957.9047 
335 702.0631 385 829.8909 435 960.5431 
336 704.5894 386 832.4775 436 963.1826 
337 707.1170 387 835.0652 437 965 .8231 
338 709 .6460 388 837.6540 438 968 .4646 
339 712.1762 389 840.2440 439 971.1071 
340 714.7076 390 842.8351 440 973.7505 
341 717.2404 391 845.4272 441 976.3949 
342 719.7744 392 848 .0205 442 979.0404 
343 722.3097 393 850.6149 443 981.6868 
344 724.8463 394 853.2104 444 984.3342 
345 727.3841 395 855.8070 445 986.9825 
346 729.9232 396 858.4047 446 989.6318 
347 732.4635 397 861.0035 447 992.2822 
348 735.0051 398 863 .6034 448 994 .9334 
349 737.5479 399 866. 2044 449 997 .5857 
350 740.0920 400 868 . 8064 450 1000 . 2389 
451 1002 .8931 501 1136.7862 551 1272.8480 
452 1005 .5482 502 1139.4870 552 1275.5899 
453 1008 . 2043 503 1142.1885 553 1278 .3327 
454 1010.8614 504 1144.8909 554 1281 .0762 
455 1013 .5194 505 1147 .5942 555 1283 .8205 
456 1016.1783 506 1150. 2984 556 1286.5655 
457 1018. 8383 507 1153.0034 557 1289 .3114 
458 1021.4991 508 1155.7093 558 1292 .0580 
459 1024. 1609 509 1158.4160 559 1294. 8054 
460 1026.8237 510 1161 .1236 560 1297 .5536 
461 1029.4874 511 1163.8320 561 1300.3026
462 1032.1520 512 1166.5412 562 1303 .0523 
463 1034 .8176 513 1169.2514 563 1305 . 8028 
464 1037 .4841 514 1171.9623 564 1308 .5541 
465 1040.1516 515 1174.6741 565 1311.3062 
466 1042 .8200 516 1177.3868 566 1314.0590 
467 1045 .4893 517 1180. 1003 567 1316.8126 
468 1048 .1595 518 1182.8146 568 1319. 5669 
469 1050 . 8307 519 1185 .5298 569 1322 .3220 
470 1053 .5028 520 1188. 2458 570 1325.0779 
471 1056.1758 521 1190.9626 571 1327 .8345 
472 1058 .8498 522 1193 .6803 572 1330.5919 
473 1061 .5246 523 1196.3988 573 1333 .3501 
474 1064 . 2004 524 1199.1181 574 1336. 1090 
475 1066.8771 525 1201.8383 575 1338.8687
476 1069 .5547 526 1204. 5593 576 1341.6291 
477 1072 .2332 527 1207.2811 577 1344 .3903 
478 1074.9127 528 1210.0037 578 1347.1522 
479 1077 .5930 529 1212.7272 579 1349 .9149 
480 1080. 2742 530 1215 .4514 580 1352.6783 
481 1082 .9564 531 1218.1765 581 1355 .4425 
482 1085.6394 532 1220.9024 582 1358.2074
483 1088 .3234 533 1223 .6292 583 1360 .9731 
484 1091 .0082 534 1226.3567 584 1363 .7395 
485 1093 .6940 535 1229.0851 585 1366. 5066 
486 1096 .3806 536 1231.8142 586 1369. 2745 
487 1099 .0681 537 1234 .5442 587 1372 .0432 
488 1101.7565 538 1237 .2750 588 1374.8126 
489 1104 .4458 539 1240 .0066 589 1377 .5827 
490 1107 . 1360 540 1242.7390 590 1380. 3535 
491 1109 .8271 541 1245 .4722 591 1383.1251 
492 1112.5191 542 1248. 2062 592 1385 .8974 
493 1115.2119 543 1250.9410 593 1388 .6705 
494 1117 .9057 544 1253 .6766 594 1391 .4443 
495 1120.6003 545 1256.4130 595 1394.2188 
496 1123.2958 546 1259.1501 596 1396 .9940 
497 1125.9921 547 1261 .8881 597 1399 .7700 
498 1128. 6893 548 1264. 6269 598 1402 .5467 
499 1131.3874 549 1267 .3665 599 1405 .3241 
500 1134.0864 550 1270. 1069 600 1408. 1023 
601 1410.8812 651 1550.7215 701 1692 .2299 
602 1413 .6608 652 1553 .5357 702 1695 .0762 
603 1416.4411 653 1556. 3506 703 1697 .9232 
604 1419.2221 654 1559. 1662 704 1700 .7708 
605 1422 .0039 655 1561 .9824 705 1703 .6190 
606 1424 .7863 656 1564. 7993 706 1706 .4678 
607 1427 .5695 657 1567 .6169 707 1709 .3172 
608 1430.3534 658 1570.4351 708 1712.1672 
609 1433 .1380 659 1573.2540 709 1715.0179 
610 1435 .9234 660 1576.0736 710 1717.8691 
611 1438 .7094 661 1578.8938 711 1720.7210 
612 1441 .4962 662 1581.7146 712 1723 .5735 
613 1444. 2836 663 1584.5361 713 1726.4266 
614 1447 .0718 664 1587 .3583 714 1729. 2803 
615 1449. 8607 665 1590.1811 715 1732.1346 
616 1452 .6503 666 1593 .0046 716 1734 .9895 
617 1455 .4405 667 1595 .8287 717 1737 .8450 
618 1458.2315 668 1598 .6535 718 1740.7011 
619 1461 .0232 669 1601 .4789 719 1743 .5578 
620 1463 .8156 670 1604 .3050 720 1746 .4152 
621 1466 .6087 671 1607 .1317 721 1749 .2731 
622 1469 .4025 672 1609 .9591 722 1752.1316 
623 1472.1970 673 1612.7871 723 1754.9908 
624 1474 .9922 674 1615.6158 724 1757 .8505 
625 1477 .7880 675 1618.4451 725 1760.7109 
626 1480 .5846 676 1621.2750 726 1763 .5718 
627 1483 .3819 677 1624. 1056 727 1766 .4333 
628 1486.1798 678 1626.9368 728 1769 .2955 
629 1488 .9785 679 1629. 7687 729 1772.1582 
630 1491.7778 680 1632 .6012 730 1775 .0215 
631 1494 .5779 681 1635 .4344 731 1777 .8854 
632 1497 .3786 682 1638. 2681 732 1780.7499 
633 1500. 1800 683 1641.1026 733 1783 .6150 
634 1502 .9821 684 1643 .9376 734 1786 .4807 
635 1505. 7849 685 1646.7733 735 1789 .3470 
636 1508 .5883 686 1649 . 6096 736 1792 .2139 
637 1511.3924 687 1652 .4466 737 1795 .0814 
638 1514.1973 688 1655 .2842 738 1797 .9494 
639 1517.0028 689 1658.1224 739 1800 .8181 
640 1519.8090 690 1660 .9612 740 1803 .6873 
641 1522.6158 691 1663 . 8007 741 1806 .5571 
642 1525 .4233 692 1666 .6408 742 1809 .4275 
643 1528.2316 693 1669 .4816 743 1812.2985 
644 1531.0404 694 1672.3229 744 1815.1701 
645 1533 .8500 695 1675. 1649 745 1818 .0423 
646 1536 .6602 696 1678 .0075 746 1820.9150 
647 1539.4711 697 1680.8508 747 1823 .7883 
648 1542.2827 698 1683 .6946 748 1826. 6622 
649 1545 .0950 699 1686.5391 749 1829. 5367 
650 1547 .9079 700 1689 . 3842 750 1832.4118 
751 1835 .2874 801 1979.7907 851 2125 .6495 
752 1838 . 1636 802 1982 .6949 852 2128.5800 
753 1841 .0404 803 1985 .5996 853 2131.5109 
754 1843 .9178 804 1988 .5049 854 2134.4424 
755 1846.7957 805 1991 .4107 855 2137 .3744 
756 1849 .6742 806 1994 .3170 856 2140. 3068 
757 1852 .5533 807 1997 .2239 857 2143 .2398 
758 1855 .4330 808 2000 . 1313 858 2146.1733 
759 1858.3133 809 2003.0392 859 2149.1073
760 1861.1941 810 2005 .9477 860 2152 .0418 
761 1864.0755 811 2008 . 8567 861 2154.9768 
762 1866 .9574 812 2011. 7663 862 2157 .9123 
763 1869 .8399 813 2014 .6764 863 2160. 8483 
764 1872 .7230 814 2017 .5870 864 2163 .7848 
765 1875 .6067 815 2020 .4982 865 2166.7218 
766 1878 .4909 816 2023 .4099 866 2169.6594 
767 1881.3757 817 2026.3221 867 2172.5974 
768 1884.2611 818 2029 . 2348 868 2175 .5359 
769 1887 . 1470 819 2032. 1481 869 2178.4749 
770 1890 .0335 820 2035 .0619 870 2181.4144 
771 1892 .9205 821 2037 .9763 871 2184.3545 
772 1895 . 8082 822 2040 .8911 872 2187 .2950 
773 1898 .6963 823 2043 .8065 873 2190. 2360 
774 1901.5851 824 2046.7225 874 2193.1775 
775 1904.4744 825 2049 .6389 875 2196.1195 
776 1907 .3642 826 2052 .5559 876 2199 .0620 
777 1910. 2547 827 2055 .4734 877 2202 .0050 
778 1913. 1456 828 2058 .3914 878 2204 .9485 
779 1916.0372 829 2061 .3100 879 2207 .8925 
780 1918 .9293 830 2064. 2291 880 2210.8370 
781 1921 .8219 831 2067 . 1487 881 2213.7820 
782 1924.7151 832 2070 .0688 882 2216.7274 
783 1927 .6089 833 2072 .9894 883 2219.6734 
784 1930. 5032 834 2075 .9106 884 2222 .6198 
785 1933. 3981 835 2078 .8323 885 2225 .5668 
786 1936. 2935 836 2081 .7545 886 2228 .5142 
787 1939. 1895 837 2084 .6772 887 2231.4621 
788 1942 .0860 838 2087 .6005 888 2234 .4106 
789 1944 .9831 839 2090. 5242 889 2237 .3595 
790 1947 .8807 840 2093 .4485 890 2240. 3088 
791 1950.7789 841 2096 .3733 891 2243 .2587 
792 1953 .6776 842 2099 . 2986 892 2246.2091 
793 1956.5769 843 2102.2244 893 2249 .1599 
794 1959 .4767 844 2105. 1508 894 2252.1113 
795 1962.3771 845 2108 .0776 895 2255 .0631 
796 1965 .2780 846 2111.0050 896 2258 .0154 
797 1968 .1794 847 2113.9329 897 2260 . 9682 
798 1971.0814 848 2116.8613 898 2263 .9215 
799 1973 .9840 849 2119.7902 899 2266. 8752 
800 1976.8871 850 2122.7196 900 2269 .8295 
901 2272 .7842 936 2376 .4993 971 2480 .7827 
902 2275 .7394 937 2379 .4711 972 2483 .7703 
903 2278.6951 938 2382 .4433 973 2486. 7584 
904 2281.6513 939 2385 .4159 974 2489 .7470 
905 2284.6079 940 2388 .3891 975 2492 .7360 
906 2287 .5650 941 2391. 3627 976 2495 .7255 
907 2290 .5226 942 2394 . 3367 977 2498 .7154 
908 2293 .4807 943 2397 .3112 978 2501.7057 
909 2296 .4393 944 2400 . 2862 979 2504 .6965 
910 2299 .3983 945 2403 .2616 980 2507 .6877 
911 2302 .3579 946 2406. 2375 981 2510 .6794 
912 2305 .3179 947 2409 . 2139 982 2513.6715 
913 2308 .2783 948 2412.1907 983 2516. 6640 
914 2311.2393 949 2415 .1679 984 2519 .6570 
915 2314.2007 950 2418. 1457 985 2522 .6505 
916 2317 .1626 951 2421.1238 986 2525 .6443 
917 2320.1250 952 2424.1025 987 2528 .6387 
918 2323 .0878 953 2427 .0816 988 2531.6334 
919 2326.0511 954 2430 .0611 989 2534 .6286 
920 2329 .0149 955 2433 .0411 990 2537 .6242 
921 2331.9792 956 2436 .0216 991 2540. 6203 
922 2334 .9439 957 2439 .0025 992 2543 .6168 
923 2337 .9091 958 2441 .9839 993 2546.6138 
924 2340 .8748 959 2444 .9657 994 2549 .6112 
925 2343 .8409 960 2447 .9479 995 2552 .6090 
926 2346 .8075 961 2450 .9307 996 2555 .6073 
927 2349.7746 962 2453 .9138 997 2558. 6059 
928 2352.7421 963 2456.8975 998 2561 .6051 
929 2355.7102 964 2459 .8815 999 2564 .6046 
930 2358 .6786 965 2462 .8661 1000 2567 .6046 
931 2361 .6476 966 2465 .8511 

932 2364 .6170 967 2468 . 8365 

933 2367 . 5869 968 2471.8224 

934 2370.5572 969 2474. 8087 

935 2373 .5281 970 2477 .7954 
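
The purpose of Table VIII, the exact computation of the probability of a fourfold table by adding and subtracting logarithms of factorials, can be sketched in a few lines. The function names are hypothetical, and the logarithms are computed from the log-gamma function rather than read from the table; the worked case is the first N2 = 15 entry of Table V (no A's in 20 against 5 A's in 15).

```python
from math import lgamma, log


def log10_factorial(n: int) -> float:
    """Common logarithm of n!, the quantity tabulated in Table VIII."""
    return lgamma(n + 1) / log(10)


def fourfold_probability(a: int, b: int, c: int, d: int) -> float:
    """Exact probability of the single fourfold table [[a, b], [c, d]]
    with all four marginal totals fixed."""
    n = a + b + c + d
    log_p = (log10_factorial(a + b) + log10_factorial(c + d)
             + log10_factorial(a + c) + log10_factorial(b + d)
             - log10_factorial(n)
             - sum(log10_factorial(x) for x in (a, b, c, d)))
    return 10 ** log_p


if __name__ == "__main__":
    print(f"log 10!  = {log10_factorial(10):.4f}")    # tabulated as 6.5598
    print(f"log 100! = {log10_factorial(100):.4f}")   # tabulated as 157.9700
    # Sample (1): 0 A's, 20 not-A's; Sample (2): 5 A's, 10 not-A's.
    print(f"P = {fourfold_probability(0, 20, 5, 10):.4f}")
```

With the tabulated values the same case reduces to log 15! + log 30! - log 35! - log 10! = 12.1165 + 32.4237 - 40.0142 - 6.5598 = -2.0338, i.e., P = 0.0093 to four decimals, the value printed in Table V. Because one cell is zero, no more extreme table exists and this single term is the whole tail.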
Note. For Graphs 1-6 (Figs. 6-11) see following inserts.
THE TREATMENT OF PHOSGENE POISONING WITH
TRACHEOTOMY AND SUCTION¹


By R. A. WAUD² AND RUTH HORNER³


Abstract 


A series of dogs poisoned by exposure to phosgene gas were treated by 
tracheotomy and suction applied by insertion of a catheter deep into the trachea.
By this method of treatment, animals were brought out of a moribund condition
and the survival time was definitely increased. The survival rate, however, was
essentially the same as that of the control animals. Although digitalis augments
the heart beat, clears up irregularities, and keeps the heart beating for a time
after respiration has failed, it may be detrimental by increasing the effusion of
the fluid into the lung rather than decreasing it. The higher the pregassing con- 
centration of haemoglobin the smaller the amount of fluid that effused into the 
lungs and the greater was the chance of survival. In practically all animals the 
point at which haemoconcentration reached a maximum was between 22 and 
23 gm. of haemoglobin per 100 mils of blood. On post-mortem examination of 
dogs dying from phosgene poisoning the stomach was found to be markedly 
distended with gas and the vessels congested. This adds to the embarrassment 
of respiration and circulation. The big problem in the treatment of phosgene 
poisoning is still the finding of some substance or means of limiting or preventing 
the effusion of fluid into the airways of the lung. 


Introduction 


After observing a great number of animals dying from phosgene poisoning,
one is impressed with the part played by the accumulation of liquid in the
airways of the lungs, which causes the animal to drown in its own fluids.
Haemoconcentration occurs regularly in dogs poisoned with phosgene (2).
Rabbits are able to keep their blood diluted, presumably by withdrawal
of fluid from the tissues. However, this maintenance of a nearly normal
haemoconcentration does not prevent the death of these animals. It is
unlikely, therefore, that haemoconcentration is the primary cause of death.
In the dog, fluid is lost from the blood and, at the same time, accumulates
in the lungs. Therefore, effusion into the lungs should eventually be limited
by a certain degree of haemoconcentration.


Because of the paramount part apparently played by the fluids in the 
lungs in the cause of death, it was decided to direct our treatment toward the 
removal and possible prevention of the accumulation of this fluid in the lungs. 
It was, therefore, decided to treat a series of dogs poisoned with phosgene by 
performing tracheotomy and removing as much fluid as possible by suction. 


Methods 


Dogs were gassed in a static chamber containing phosgene in a concentration
of 0.80 mgm. per liter. Haemoglobin estimations were made immediately
before gassing and at regular intervals following. In a preliminary
experiment, the dogs were exposed to the gas for a period of 15 min. This
exposure, however, was found to be insufficient to kill most of the animals.
The time of exposure was, therefore, increased to 20 min. Thirty-two dogs
were gassed for the longer period; 17 were treated and 15 used as controls.
After being gassed, the animals were observed continuously day and night
until they died or the crisis was over.


1 Manuscript received June 12, 1947.
Contribution from the Department of Pharmacology, University of Western Ontario,
London, Ont.
2 Professor of Pharmacology.
3 Research Assistant.


After removal from the tank, each animal was kept under observation until 
signs of pulmonary involvement had developed to the point where the animal 
was coughing up frothy fluid and in most cases practically in a dying state. 
Tracheotomy was then performed according to Jackson’s technic (1, pp. 382- 
405) and a tracheal cannula, with two openings to the outside, inserted. If 
the animal was conscious procaine anaesthesia was used. A No. 18 French 
rubber catheter attached to a filter pump was then passed into the trachea 
to the level of bifurcation. A Woulff bottle was connected between the 
catheter and the filter pump to collect the pulmonary fluid. Sufficient nega- 
tive pressure was used to keep the fluid flowing through the tube. As lying 
on its side interferes with the maximum expansion of the animal’s chest and 
proper drainage, the animal was kept in the sitting position. In order to
keep as much as possible of the airway clear, the position of the end of the
catheter was frequently changed, and the catheter was kept in the animal until
there was little or no fluid coming through. When it became certain that the wet stage
had passed, the tracheal cannula was removed but the tracheal opening was 
not closed. 


In a previous report (3), some evidence was presented that suggested that 
the intravenous administration of theophylline ethylenediamine has a bene- 
ficial effect in some cases of phosgene poisoning. Injections of this drug were, 
therefore, given during the acute wet stage. 


Atropine, because of its stimulating effect on the respiratory center, and its 
effects on vagal innervation of the glands and smooth muscle of the bronchi, 
was injected into the treated animals. Epinephrine was also injected to
dilate the bronchi and decrease the secretions.


As the heart became very rapid and irregular during the stage of acute 
pulmonary oedema, an attempt was made to control this with digitalis. 


Water was given ad libitum but very little was taken by the animals until 
the wet stage had passed. Oxygen was given as soon as the airway was 
cleared of fluid. 

Results 


Often the immediate results have been dramatic. In animals that have 
become deeply cyanosed and moribund, in a short time after instituting 
suction, the mucous membranes became pink, the respiration changed from 
irregular gasps to a deep vigorous type, and the reflexes returned. The flow 
of fluid in some animals was not continuous but was interrupted by periods of 
relatively no flow. Later the periods of flow shortened and finally suction
was not necessary. In other animals there seemed to be little decrease in
flow over a period of hours. The amount of fluid removed from a lung varied
from 100 mils to 450 mils and contained a considerable number of blood cells.
The time of occurrence and duration of each gush varied in each animal.
During a gush the greater part of the respiratory tree became filled with
frothy fluid, which, if not removed by suction, caused acute asphyxia of the
animal. Even after the wet stage is passed, thick tenacious material may
plug the trachea and, if not removed by suction or by coughing, will asphyxiate
the animal.
Fig. 1. Curves showing haemoglobin values of dogs surviving past the wet stage. The time
at which suction treatment was started is indicated by arrow.

The tracheotomy alone was found to facilitate removal of fluid from the
trachea and bronchi. It shortened the airway and overcame any obstruction
that might be present in the larynx. It also made possible the expulsion of a
considerable amount of the frothy fluid and tenacious material by the animal
itself, either by coughing or by vomiting-like movements.

Theophylline increased the depth of respiration and in that way improved
the condition of the animal for the time being at least. Under the conditions
in which it was used, it would be difficult to determine whether or not it increased the
survival time of the animals. No evidence of diminution in pulmonary oedema
fluid was noted.

Atropine also caused marked respiratory stimulation and for a time improved
the pulmonary ventilation, but there was evidence that a depression followed
that could not be overcome by the administration of more atropine. There
was no evidence that atropine diminished the pulmonary oedema fluid.
Digitalis increased the force of the heart beat, slowed the rate, and removed the
irregularities. In all digitalized dogs that died, respiration failed before the heart.

Following gassing, all the animals showed considerable haemoconcentration.
The time at which the maximum concentration was reached varied from 6 to 36
hours after gassing. The average was 16 hr. The maximum haemoconcentration
varied from 126% to 187.5% of normal. The number of grams of
haemoglobin contained in 100 mils of blood of each animal before gassing was
taken as the normal for that animal. The average normal level for the treated
animals that survived was 16.5 gm. of haemoglobin per 100 mils of blood,
while the average for those animals that were treated and that ultimately
died was 15.0 gm. per 100 mils of blood. The average maximum concentration
reached in the treated animals that survived was 22.2 gm. In those
in which we were able to prolong life but that ultimately died, it was 22.1 gm.
of haemoglobin per 100 mils of blood. Graphic representations of the percentage
variations of the haemoglobin values together with the time are shown
in the graphs (Figs. 1 and 2).
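
The percentage-of-normal figures quoted above follow from simple division of the maximum haemoglobin value by the pregassing value. As a rough illustration only, the sketch below applies that division to the group averages reported in this paragraph; the function name is hypothetical, and a ratio of averages is not the same as the average of the individual animals' ratios, which the text does not give.

```python
def percent_of_normal(maximum_gm: float, pregassing_gm: float) -> float:
    """Maximum haemoglobin as a percentage of the pregassing (normal) value."""
    return 100.0 * maximum_gm / pregassing_gm


if __name__ == "__main__":
    # Group averages reported in the text, gm. per 100 mils of blood.
    survived = percent_of_normal(22.2, 16.5)
    died = percent_of_normal(22.1, 15.0)
    print(f"Treated, survived:        {survived:.1f}% of normal")
    print(f"Treated, ultimately died: {died:.1f}% of normal")
```

On these averages the survivors reached about 134.5% of normal and the animals that died about 147.3%, although both groups levelled off at nearly the same absolute value.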
Fig. 2. Curves showing haemoglobin values of dogs dying during wet stage. The time
at which suction treatment was started is indicated by arrow.
In the 20 min. series 32 dogs were gassed. Fifteen of these were kept as
controls and 17 were treated. Two of the treated animals, after being brought
out of a moribund condition by tracheotomy and suction, died later unattended.
Of the remaining 15 animals, three were complete survivals. Three of the
control animals also survived.

The survival time of the 14 treated animals that died was prolonged in 
most cases from 3 to 10 hr. One dog survived 116 hr. after tracheotomy and 
finally died of pneumonia. One lung was consolidated. Another dog sur- 
vived 27 hr. after tracheotomy but, when apparently in very good condition, 
died of over-exertion when an attempt was made to administer a capsule con- 
taining sulfathiazole. Although this dog was almost moribund, it was not 
in coma when the tracheotomy was done. 


Discussion 


In poisoning with phosgene, damage at the alveolar wall results in an 
effusion of a serum-like material that tends to fill up the alveoli and work into 
the bronchioles, bronchi, and trachea. Here it is mixed with secretions 
provoked by the irritating gas. 

Variability in the response of individual animals of the same species to an 
identical concentration and time of exposure to phosgene is now well recognized. 
In addition to the variations in the physical condition of the animal there 
are variations in the degree of excitement and activity, the depth, rate, 
volume, and type of respiration. These factors in turn govern the amount of 
phosgene entering the lung. To some degree they can be overcome by 
allowing the animal to remain a sufficient length of time in the tank before 
adding the gas. The temperature and humidity are also factors. In these
experiments, therefore, in addition to the use of control animals, treatment
was not instituted until we felt that the animal was in a dying condition.
Although this put the treatment at a disadvantage, we feel that any prolongation
of life was without doubt due to the treatment.

Tracheotomy and suction will, temporarily at least, prevent an animal in
an initial acute pulmonary oedema from drowning in its own fluids, but will not
prevent further accumulation of fluid. In those animals in which the dose is not
too large and there is a tendency for the loss of fluid into the airways to cease, 
it is possible that the treatment may effect a cure. 


Neither theophylline ethylenediamine, atropine, nor epinephrine reduced 
the oedema flow. Isotonic saline solution injected intravenously during the 
wet stage increased the flow of fluid into the lungs and did not lower the haemo- 
concentration. After the wet stage had passed, the animals drank consider- 
able water voluntarily. This was not followed by increased moisture in the 
lungs but by a lowering of the haemoconcentration. Apparently the animal 
at this stage is able to retain fluids in the blood stream. 


172 CANADIAN JOURNAL OF RESEARCH. VOL. 26, SEC. E. 


There is evidence in these experiments that dogs with a high initial haemo- 
globin value are more likely to survive. The average initial haemoglobin 
concentration for those animals that passed the wet stage was 16.5 gm. per 
100 mils of blood while the average for those that failed to pass this stage 
was 14.6 gm. Haemoconcentration in nearly all of the animals ceased when 
the values reached between 22 and 23 gm. per 100 mils of blood. It would 
appear that the nearer the initial value is to the maximum, the less fluid 
will escape into the lungs to embarrass the respiration. 


Digitalis controlled the irregularities and rate of the heart and kept it
beating for some time after the respiratory center had failed. This was
accomplished only after a proper dose had been worked out on phosgene-poisoned
dogs. The total therapeutic dose of digoxin was found to be 0.075 mgm. per kgm.
This was given intramuscularly in three divided doses, at four-hour intervals.
One-half the quantity was given in the first dose and the remaining half
further divided.
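
The dosage schedule described here reduces to simple arithmetic. The sketch
below assumes that "the remaining half further divided" means two equal
quarter doses, which the text does not state explicitly; the function name and
the 10 kg body weight are illustrative only.

```python
def digoxin_schedule(weight_kg, total_mg_per_kg=0.075):
    """Split the total digoxin dose into three intramuscular doses.

    Assumption (not stated in the text): the second half of the total
    is divided equally between the second and third doses.
    """
    total_mg = total_mg_per_kg * weight_kg
    fractions = (0.5, 0.25, 0.25)   # first dose is half the total
    hours = (0, 4, 8)               # doses given at four-hour intervals
    return [(h, total_mg * f) for h, f in zip(hours, fractions)]


if __name__ == "__main__":
    # Illustrative 10 kg dog; the weight is not taken from the paper.
    for hour, dose_mg in digoxin_schedule(weight_kg=10.0):
        print(f"hour {hour}: {dose_mg:.4f} mg")
```

The three amounts always sum to the stated total of 0.075 mgm. per kgm.,
whatever split of the second half is assumed.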

Although the animals drown in their own pulmonary fluid, the length of 
survival or recovery depends, to some degree, on the length of time the heart 
can maintain adequate circulation in the respiratory center and other vital 
organs. However, once the respiratory fluid has accumulated to the point 
where it prevents contact of the outside air with the blood, it becomes im- 
possible for the heart, no matter what its condition, to maintain an adequate 
partial pressure of oxygen in its own muscle, in the respiratory center and 
other vital organs. 


In dogs dying of phosgene poisoning we were impressed with the amount
of distension of the stomach with gas that was found on opening
the abdomen. This distension was so great in most animals that movements
of the diaphragm were markedly curtailed and the pressure on the heart
must have caused it considerable embarrassment. The distension could
be caused by air swallowed as a result of vigorous respiratory effort. On the
other hand, the origin of the gas may have been the blood stream. In this
connection, the vessels on the surface of the stomach were found to be markedly
congested and hemorrhagic in practically all animals. In addition, the partial
pressure of carbon dioxide in the blood is high because its escape through
the lung has been partially cut off. Both of these factors may combine
to produce the distension. Tracheotomy was not a factor since the distension
occurred in both treated and nontreated animals. The gas has not been
analyzed.


The importance of absolute rest and freedom from excitement in recovery 
from phosgene poisoning cannot be overemphasized. Often a respiratory 
gush was seen to have been initiated by slight excitement of the animal or a 
struggle.


A STUDY OF THE PHARMACOLOGICAL PROPERTIES OF
ANNOTININE AND LYCOPODINE¹


By Guy Marier² and Richard Bernard³


Abstract


The authors have studied the pharmacology of annotinine and lycopodine,
alkaloids extracted from Lycopodium annotinum and Lycopodium clavatum
respectively. The median lethal dose of these two alkaloids, determined by
intraperitoneal injection in mice, is 166.2 mg./kg. for annotinine and
78.41 ± 1.06 mg./kg. for lycopodine. At toxic doses these alkaloids produce
tonic and clonic convulsions, asphyxia, and paralysis in the frog, the mouse,
and the rabbit. With annotinine the paralysis appears to affect chiefly the
hind limbs. At sublethal doses the alkaloids increase the amplitude of the
respiratory movements and decrease their frequency. In the rabbit, annotinine
produces hypertension, peripheral vasodilatation, pronounced miosis, a marked
hyperglycaemia caused by a discharge of adrenaline, a hypothermia that affects
not only the normal temperature but also fever, and an inversion of the T wave
of the electrocardiogram. Annotinine and lycopodine have a positive inotropic
action on the frog heart and a negative inotropic action on the rabbit heart.
Annotinine decreases the amplitude of intestinal contraction and increases
that of the uterus. Lycopodine, on the contrary, increases the amplitude of
both intestinal and uterine contraction. Lycopodine also causes a hypothermia
that is weaker and of shorter duration than that caused by annotinine. It has
no influence on the blood sugar and does not produce miosis. Annotinine and
lycopodine have no influence on the blood cells and have no antibiotic power.


Introduction 


The club mosses (Lycopodium) are pteridophytes of the family Lycopodiaceae.
This group, widely distributed in the Province of Quebec, contains a large
number of alkaloids, most of which have been studied chemically by Manske and
Marion (10-17) of the National Research Council of Canada. From the various
species of Lycopodium these two authors have isolated more than thirty new
alkaloids, among them annotinine. They have also found a few known alkaloids,
among them lycopodine.

Little is known of the chemical structure of annotinine and lycopodine.
Manske and Marion (11) assign to annotinine the empirical formula C₁₆H₂₁O₃N,
with a melting point of 232° C. Lycopodine, isolated for the first time by
Bödeker in 1881 (3), then again by Achmatowicz (1) in 1938 and by Manske and
Marion (10) in 1943, has the empirical formula C₁₆H₂₅ON, with a melting point
of 115-116° C. Manske and Marion (14) have put forward the hypothesis that
lycopodine possesses a hydrogenated quinoline nucleus and that the oxygen is
present in it as a cyclic ester.


¹ Manuscript received November 17, 1947.

On the pharmacological side, the principal published works are those of
Oficjalski (19) on the toxicity of the Lycopodium alkaloids (galenical
preparations); of Nikinorow (18) on the antipyretic properties of these
alkaloids (galenical preparations); and of Achmatowicz (1) on lycopodine, which
causes paralysis of the central and peripheral nervous system in the frog as
well as a tonic action on respiration. Finally, in 1945, Lee and Chen (9)
reported that annotinine and lycopodine, in suitable doses, cause muscular
incoordination and paralysis in the frog. In small doses both alkaloids raise
the arterial pressure of the cat. Also according to Lee and Chen, lycopodine
causes contraction of the isolated intestine and uterus.


Experimental
Extraction

Annotinine was extracted from Lycopodium annotinum by the method described by
Manske and Marion (10, 11). Annotinine as extracted melts at 232° C.; for
injection it was converted into annotinine hydrochloride, which melts at
285° C. Lycopodine was extracted from Lycopodium clavatum by a method based on
that of Marion (10, 11) for the total alkaloids and on that of Achmatowicz (1)
for the separation and purification of lycopodine; the alkaloid so obtained
melts at 115-116° C. Lycopodine hydrochloride melts at 358° C.

Toxicity

We determined the median lethal dose (L.D. 50) following intraperitoneal
injection in the mouse. For annotinine the determination was made by the
method of Behrens (4) and gave the value 166.2 mg./kg. The median lethal dose
of lycopodine, calculated by the method of Kärber (4), was 78.41 ± 1.06 mg./kg.
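
For readers unfamiliar with the Kärber calculation, the following is a minimal
sketch of the arithmetic form of the method as it is commonly stated: the
lowest dose killing every animal, minus the sum of (dose step × mean number of
deaths in adjacent groups) divided by the group size. The doses and death
counts are hypothetical and are not the authors' data, which the paper does
not give; the function name is illustrative, and the standard error quoted in
the text is not computed here.

```python
def karber_ld50(doses, deaths, n_per_group):
    """Arithmetic (Karber) estimate of the median lethal dose.

    doses       -- ascending doses, e.g. in mg/kg
    deaths      -- number of animals dead at each dose
    n_per_group -- animals per dose group (equal groups assumed)
    """
    if len(doses) != len(deaths) or len(doses) < 2:
        raise ValueError("need matching dose and death lists of length >= 2")
    if deaths[0] != 0 or deaths[-1] != n_per_group:
        raise ValueError("series must run from 0% to 100% mortality")
    total = 0.0
    for i in range(len(doses) - 1):
        dose_step = doses[i + 1] - doses[i]
        mean_deaths = (deaths[i] + deaths[i + 1]) / 2
        total += dose_step * mean_deaths
    # Highest dose (100% mortality) minus the weighted sum per animal.
    return doses[-1] - total / n_per_group


if __name__ == "__main__":
    # Hypothetical series, NOT taken from the paper.
    doses = [50, 60, 70, 80, 90, 100]
    deaths = [0, 2, 4, 6, 8, 10]
    print(karber_ld50(doses, deaths, n_per_group=10))
```

The method requires the dose series to span from no deaths to complete
mortality, which is why the sketch rejects any series that does not.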

On injection of toxic doses of either alkaloid, we observed similar reactions
in the frog: annotinine (300 to 450 mg./kg.) and lycopodine (50 to
200 mg./kg.), injected into the dorsal lymph sac, produce muscular
incoordination, most evident in the hind legs, and a fairly complete
paralysis, often to the point of suggesting that the animal is dead. In the
mouse, intraperitoneal injection of annotinine (150-250 mg./kg.) and of
lycopodine (50-160 mg./kg.) produces hyperexcitability, tonic and clonic
convulsions, asphyxia, and paralysis. The paralysis produced by annotinine
appears to be localized in the hind legs. In the rabbit, intravenous injection
of lycopodine (70-200 mg./kg.) and of annotinine (150-250 mg./kg.) produces
hyperexcitability, convulsions, and asphyxia. Annotinine produces in addition
a paralysis of the hind limbs and miosis. Lycopodine produces defecation and a
peripheral vasodilatation much more pronounced than that caused by injection
of annotinine.

Circulation

Arterial pressure and respiration

Annotinine injected intravenously (30-100 mg./kg.) in the rabbit anaesthetized
with urethane produces, immediately after injection, a very rapid hypotension
of short duration, followed by a hypertension that persists for an
indeterminate time. This hypertension ranges from 30 to 53 mm. of mercury.
Vagotomy followed by injection of the alkaloid does not give different
results.

In the cat and the dog, annotinine in the same doses produces a pronounced
fall in arterial pressure, followed by a gradual rise to a level still below
the normal pressure. This hypotension persists for an indeterminate time.


Lycopodine injected intravenously in a dose of 30 mg./kg. causes, in both the
rabbit and the cat, a maximal hypotension immediately after injection. The
pressure then rises somewhat but remains below normal for an indeterminate
time.

The effect of annotinine and of lycopodine appears to be additive and
proportional to the dose given.

Respiration is somewhat modified by annotinine and lycopodine. The respiratory
rate decreases and the amplitude increases.

Isolated perfused heart

Perfusions were carried out by the method of Straub (21) for the hearts of
cold-blooded animals and by that of Langendorff (7) for the hearts of
warm-blooded animals.

Frog hearts were perfused with solutions of lycopodine at 1 : 200,000. At
first lycopodine has a positive inotropic effect; later the inotropic action
tends to become negative. On isolated rabbit hearts, lycopodine (1 : 50,000 to
1 : 400,000) has a pronounced negative inotropic action. In neither case does
lycopodine appear to affect the rate of contraction appreciably.

Annotinine (3 : 50,000 to 3 : 200,000) has a positive inotropic action on the
frog heart, with a tendency to negative chronotropism. On rabbit hearts this
alkaloid (1 : 25,000 to 1 : 200,000) has a negative inotropic action, with a
tendency to negative chronotropism.

It is thus seen that the two alkaloids have essentially the same action on the
frog heart on the one hand and on the rabbit heart on the other.
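
The perfusion concentrations in this section are given as dilution ratios. A
small sketch of converting such a ratio to milligrams per litre follows. It
assumes the ratios are weight/volume (grams of alkaloid per millilitres of
perfusion fluid), which the text does not state; the function name is
illustrative.

```python
def dilution_to_mg_per_litre(parts_solute, parts_solution):
    """Convert a dilution ratio such as 1 : 200,000 to mg per litre.

    Assumption: the ratio is weight/volume, i.e. grams of solute per
    millilitres of solution.
    """
    if parts_solute <= 0 or parts_solution <= 0:
        raise ValueError("both parts of the ratio must be positive")
    grams_per_ml = parts_solute / parts_solution
    return grams_per_ml * 1000.0 * 1000.0  # g/ml -> mg/ml -> mg/litre


if __name__ == "__main__":
    # Ratios quoted in the text for lycopodine on frog and rabbit hearts.
    for solute, solution in [(1, 200_000), (1, 50_000), (1, 400_000)]:
        mg_l = dilution_to_mg_per_litre(solute, solution)
        print(f"{solute} : {solution:,} is about {mg_l:.1f} mg per litre")
```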

Electrocardiogram

Lycopodine, even in doses of 30 mg./kg., does not affect the electrocardiogram
of the rabbit. Annotinine, on the other hand, in doses of 30-75 mg./kg. in the
rabbit and 50-100 mg./kg. in the cat, causes disappearance or inversion of the
T wave. This change occurs in both the normal and the anaesthetized animal. It
is, moreover, reversible, the T wave returning to normal after a few hours;
this shows that, although the myocardium is affected, it is not damaged.

Smooth muscle


Records of intestinal and uterine contractions were made by the method of
Magnus (8).


Isolated intestine


Annotinine at concentrations of 1 : 6000 to 1 : 25,000 was studied on isolated
intestines of rats, guinea pigs, rabbits, and cats. At these concentrations
the alkaloid decreases the amplitude of the contractions without decreasing
their frequency. In addition, the rabbit intestine loses tone. At higher doses
all automatic movements are inhibited by annotinine.


Lycopodine has an effect opposite to that of annotinine on the intestines of
rats and guinea pigs. At concentrations of the order of 1 : 12,500 to
1 : 50,000, lycopodine increases the amplitude of the pendular movements of
the intestine without changing their frequency.


Isolated uterus


Annotinine and lycopodine cause uterine contraction. If contraction is already
present, both alkaloids increase it. The concentration of annotinine used is
1 : 5000 to 1 : 12,500 for the uteri of rats, guinea pigs, and rabbits. That
of lycopodine is 1 : 16,500 for the uteri of guinea pigs.

The response of the intestine and of the uterus to annotinine and to
lycopodine suggests that the former alkaloid may be regarded as a sympathetic
stimulant and the latter as a parasympathetic stimulant.


Body temperature


The method used to detect the action of annotinine and lycopodine on body
temperature is derived from the test for pyrogenic substances (20). The
substances used were passed through a Seitz filter to free them of all
pyrogens. For annotinine, 10 rabbits were used, five of them as controls
receiving only sodium chloride 9 : 1000. The other five received annotinine
intravenously in a dose of 50 mg./kg. For lycopodine, 8 rabbits were used,
4 of them as controls receiving only sodium chloride 9 : 1000; the other
4 received lycopodine intravenously in a dose of 30 mg./kg.


Annotinine and lycopodine lower the body temperature. The fall due to
annotinine is rapid and appreciably more prolonged than that due to
lycopodine. The actual fall in temperature one hour and a quarter after
injection of annotinine is of the order of 1.30° F., while that due to
lycopodine one hour after injection is of the order of 0.78° F.


Annotinine acts not only on the normal temperature but also on the fever
produced by intramuscular injection of boiled milk. In this case the mean
hypothermia produced is, after one hour, of the order of 3.52° F., the fever
being of the order of 0.96° F. and the mean fall in temperature being
2.56° F. (Fig. 1).

After four or five hours the temperature has returned to normal both in the
febrile animals and in the normal animals that received annotinine. Annotinine
is therefore an antithermic agent and a febrifuge. Lycopodine is also, but to
a lesser degree, an antithermic agent and probably a febrifuge.

Fig. 1. Annotinine and fever.


Lycopodine appears to act on the temperature not by direct action on the
heat-regulating centre but through the peripheral vasodilatation that it
produces. The heat-regulating centre comes into action to correct the fall in
temperature caused by the loss of heat, and this is why the temperature rises
again so quickly. Annotinine lowers the temperature not only because it
produces vasodilatation but also because it acts on the heat-regulating
centre, which is depressed. This explains why the temperature takes so long to
return to normal.
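
The three figures reported in this section for the fever experiment (3.52° F.,
0.96° F., and 2.56° F.) can be checked against one another. The sketch below
rests on one reading of the text, namely that the total fall equals the fever
plus the fall below normal; that interpretation is an assumption, as are the
variable and function names. The conversion to Celsius is added for
convenience only.

```python
# Values as reported in the text, in degrees Fahrenheit.
FEVER_F = 0.96               # rise above normal after boiled milk
FALL_BELOW_NORMAL_F = 2.56   # mean fall in temperature
TOTAL_FALL_F = 3.52          # mean hypothermia after one hour


def total_fall(fever, fall_below_normal):
    """Total drop from the febrile level (assumed reading of the text)."""
    return fever + fall_below_normal


def f_interval_to_c(delta_f):
    """Convert a temperature difference (not a reading) from F to C."""
    return delta_f * 5.0 / 9.0


if __name__ == "__main__":
    computed = total_fall(FEVER_F, FALL_BELOW_NORMAL_F)
    consistent = abs(computed - TOTAL_FALL_F) < 1e-9
    print(f"fever + fall = {computed:.2f} F, matches reported total: {consistent}")
    print(f"total fall in Celsius: {f_interval_to_c(TOTAL_FALL_F):.2f} C")
```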


Antibiotic power


Annotinine and lycopodine, even at a concentration of 1%, have no antibiotic
power against cultures of Staphylococcus aureus, whether in nutrient broth or
on nutrient agar.

Blood cell count


Annotinine injected intravenously in a dose of 50 mg./kg. has no effect on the
number of red blood cells. The number of white cells increases slightly,
however, but it cannot be said whether this increase is due to annotinine or
to pyrogenic substances that the solution may have contained (2).

Blood glucose


Lycopodine in a dose of 30 mg./kg. by intravenous injection has no effect on
the blood sugar. Annotinine, on the other hand, proves to be a very powerful
hyperglycaemic agent. The method used to study the action of annotinine on the
blood sugar is the micromethod of Folin (6), adapted to the Fisher
electrophotometer (5). Annotinine is injected intravenously in a dose of
50 mg./kg. Immediately after injection, the rabbits receiving annotinine show
a rise in blood glucose. This rise is maximal after three hours and lasts
seven to eight hours. The mean maximum rise in glucose is 102.7 mg. per
100 cc. of blood (Fig. 2).


The rise in glucose may be due to a direct action of annotinine on glycogenic
function. It may also occur through the intermediary of certain glands whose
secretion is stimulated by annotinine. Thus annotinine may stimulate the
adrenal glands and cause a discharge of adrenaline into the organism, this
discharge being responsible for the hyperglycaemia.

To test this, we completely adrenalectomized a group of rabbits and kept them
alive by a daily injection of half a cubic centimetre of adrenal extract
(Adrenal Cortical Extract, gift of the Connaught Laboratories). Subsequent
injection of annotinine into these rabbits no longer produces hyperglycaemia
but rather a slight though significant hypoglycaemia.

Annotinine therefore produces a marked hyperglycaemia that arises not from a
direct action of the alkaloid on glycogenic function but indirectly through
adrenaline. Apart from the adrenaline effect, annotinine is a mild
hypoglycaemic agent.


Pupil


Annotinine and lycopodine do not act alike on the pupil of the eye. Lycopodine
has no effect on the pupil, whether after instillation or after intravenous or
subcutaneous injection. In contrast, intravenous injection of annotinine
(50-100 mg./kg.) produces a very pronounced miosis, which appears almost
immediately after injection and lasts two to three hours. After an hour and a
half the miosis is maximal, the pupil having decreased by more than half.
Subcutaneous injection produces the same reaction, but it is much slower to
appear.

Instillation into the conjunctival sac of a 1% solution of atropine, or
intravenous injection of atropine, produces, by paralysis of the
parasympathetic endings of the oculomotor nerve, an intense mydriasis which,
however, cannot withstand injection of annotinine; the mydriasis due to
atropine disappears at once and gives place to a pronounced miosis.

Fig. 2. Annotinine and blood glucose.


Instillation into the eye of a 4% solution of annotinine produces a slight and
barely perceptible miosis. However, removal of both eyes of a rabbit and
immersion of one of them in a 1 per mille solution shows this miosis more
clearly: the eye immersed in the 4% solution has a pupil much more contracted
than the other.

We therefore believe that annotinine produces this miosis by local action on
the radial muscle of the iris. This miosis is the effect of a parasympathetic
stimulant.

Discussion


Annotinine, although less toxic than lycopodine, is pharmacologically the more
active of the two. The action of lycopodine on the principal functions of the
organism, and especially its action on smooth muscle, make it a
parasympathetic stimulant. The same cannot be said of annotinine, however.
Certain effects of this alkaloid, such as its action on the smooth muscle of
the intestine and uterus, its effect on the arterial pressure of the rabbit,
and even its action on the blood sugar, make it a sympathetic stimulant. On
the other hand, its action on the blood glucose of the adrenalectomized
rabbit, and especially its action on the pupil, make it a parasympathetic
stimulant. We thus appear to be dealing with a mixed stimulant that acts in
one direction on certain organs and in another direction on others.


The action of lycopodine and annotinine on the central nervous system is of
minor importance compared with the action of these two drugs on the autonomic
nervous system.